diff --git a/README.md b/README.md index 73240b1..7001815 100644 --- a/README.md +++ b/README.md @@ -6,10 +6,10 @@ # NxBench
- +
-**nxbench** is a comprehensive benchmarking suite designed to facilitate comparative profiling of graph analytic algorithms across NetworkX and compatible backends. Built with an emphasis on extensibility and detailed performance analysis, nxbench aims to enable developers and researchers to optimize their graph analysis workflows efficiently and reproducibly. +**nxbench** is a comprehensive benchmarking suite designed to facilitate comparative profiling of graph analytic algorithms across NetworkX and compatible backends. Built on top of [Airspeed Velocity (ASV)](https://github.com/airspeed-velocity/asv), nxbench places an emphasis on extensible and granular performance analysis, enabling developers and researchers to optimize their graph analysis workflows efficiently and reproducibly. ## Key Features @@ -31,7 +31,13 @@ PyPi: pip install nxbench ``` -From a local clone, try Docker: +From a local clone: + +```bash +make install +``` + +Docker: ```bash # CPU-only @@ -73,9 +79,14 @@ nxbench benchmark export 'results/results.csv' --output-format csv # convert be 4. View results: ```bash -nxbench viz serve # launch interactive dashboard +nxbench viz serve # visualize results using parallel categories dashboard ``` ++ +
+ + ## Advanced Command Line Interface The CLI provides comprehensive management of benchmarks, datasets, and visualization: @@ -90,7 +101,7 @@ nxbench --config 'nxbench/configs/example.yaml' -vvv benchmark run # debug benc nxbench benchmark export 'results/benchmarks.sqlite' --output-format sql # export the results into a sql database # Visualization -nxbench viz serve # launch parallel categories dashboard +nxbench viz serve # visualize results using parallel categories dashboard nxbench viz publish # generate static asv report ``` diff --git a/doc/_static/assets/animation.gif b/doc/_static/assets/animation.gif new file mode 100644 index 0000000..28a51f7 Binary files /dev/null and b/doc/_static/assets/animation.gif differ diff --git a/doc/_static/assets/favicon.ico b/doc/_static/assets/favicon.ico new file mode 100644 index 0000000..95dbf59 Binary files /dev/null and b/doc/_static/assets/favicon.ico differ diff --git a/doc/_static/assets/nxbench_logo.png b/doc/_static/assets/nxbench_logo.png new file mode 100644 index 0000000..f3654a7 Binary files /dev/null and b/doc/_static/assets/nxbench_logo.png differ diff --git a/doc/_static/favicon.ico b/doc/_static/favicon.ico deleted file mode 100644 index 2915ca4..0000000 Binary files a/doc/_static/favicon.ico and /dev/null differ diff --git a/doc/_static/nxbench_logo.png b/doc/_static/nxbench_logo.png deleted file mode 100644 index ecb197a..0000000 Binary files a/doc/_static/nxbench_logo.png and /dev/null differ diff --git a/doc/conf.py b/doc/conf.py index b9f3166..00bae67 100644 --- a/doc/conf.py +++ b/doc/conf.py @@ -61,8 +61,8 @@ html_title = "NxBench" html_baseurl = "https://networkx.org" html_copy_source = False -html_favicon = "_static/favicon.ico" -html_logo = "_static/nxbench_logo.png" +html_favicon = "_static/assets/favicon.ico" +html_logo = "_static/assets/nxbench_logo.png" html_theme_options = { # "gtag": "G-XXXXXXXXXX", "source_repository": "https://github.com/dpys/nxbench/", diff --git a/doc/examples.md b/doc/examples.md index 3c28006..a25fb8c 100644 --- a/doc/examples.md +++ b/doc/examples.md @@ -166,27 +166,26 @@ matrix: --- -## **5. ASV Configuration** +## **5. Environment Configuration** ### **Purpose** -Configures ASV-specific settings, including the repository URL, branches to benchmark, and required dependencies. +Configures environment settings, such as the Python and dependency versions. ### **Fields** -- **`repo`** *(string, required)*: URL of the Git repository to benchmark. -- **`branches`** *(list of strings, required)*: Specifies the branches in the repository to benchmark. +- **`pythons`** *(list of strings, required)*: List of valid Python versions (e.g., "3.10", "3.11"). - **`req`** *(list of strings, required)*: Lists the Python dependencies required for benchmarking. ### **Example Entry** ```yaml -asv_config: - repo: "https://github.com/dpys/nxbench.git" - branches: - - "main" +env_config: req: - "networkx==3.4.2" - "nx_parallel==0.3" - "graphblas_algorithms==2023.10.0" + pythons: + - "3.10" + - "3.11" ``` diff --git a/doc/index.rst b/doc/index.rst index c9a79b1..723bbbf 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -3,7 +3,7 @@ Welcome to NxBench's Documentation Overview ======== -**nxbench** is a comprehensive benchmarking suite designed to facilitate comparative profiling of graph analytic algorithms across NetworkX and compatible backends.
Built with an emphasis on extensibility and detailed performance analysis, nxbench aims to enable developers and researchers to optimize their graph analysis workflows efficiently and reproducibly. +**nxbench** is a comprehensive benchmarking suite designed to facilitate comparative profiling of graph analytic algorithms across NetworkX and compatible backends. Built on top of `Airspeed Velocity (ASV) <https://github.com/airspeed-velocity/asv>`_, nxbench places an emphasis on extensible and granular performance analysis, enabling developers and researchers to optimize their graph analysis workflows efficiently and reproducibly. Key Features ============ diff --git a/doc/installation.md b/doc/installation.md index 2d7ce82..a7a017f 100644 --- a/doc/installation.md +++ b/doc/installation.md @@ -6,7 +6,13 @@ PyPi: pip install nxbench ``` -From a local clone, try Docker: +From a local clone: + +```bash +make install +``` + +Docker: ```bash # CPU-only diff --git a/nxbench/benchmarks/benchmark.py b/nxbench/benchmarks/benchmark.py index 2fdefbe..10c8947 100644 --- a/nxbench/benchmarks/benchmark.py +++ b/nxbench/benchmarks/benchmark.py @@ -6,6 +6,7 @@ import traceback import warnings from functools import partial +from importlib import import_module from typing import Any import networkx as nx @@ -14,6 +15,9 @@ from nxbench.benchmarks.utils import ( get_available_backends, get_benchmark_config, + is_graphblas_available, + is_nx_cugraph_available, + is_nx_parallel_available, memory_tracker, ) from nxbench.data.loader import BenchmarkDataManager @@ -36,7 +40,7 @@ def generate_benchmark_methods(cls): """Generate benchmark methods dynamically for each combination of algorithm, backend, and number of threads without redundant executions. """ - config = get_benchmark_config() + config = cls.config algorithms = config.algorithms datasets = [ds.name for ds in config.datasets] available_backends = get_available_backends() @@ -102,6 +106,8 @@ def track_method(self): class GraphBenchmark: """Base class for all graph algorithm benchmarks.""" + config = get_benchmark_config() + def __init__(self): self.data_manager = BenchmarkDataManager() self.graphs = {} @@ -110,12 +116,11 @@ def setup_cache(self): """Cache graph data for benchmarks.""" self.graphs = {} - config = get_benchmark_config() - datasets = [ds.name for ds in config.datasets] + datasets = [ds.name for ds in self.config.datasets] for dataset_name in datasets: dataset_config = next( - (ds for ds in config.datasets if ds.name == dataset_name), + (ds for ds in self.config.datasets if ds.name == dataset_name), None, ) if dataset_config is None: @@ -160,37 +165,52 @@ def prepare_benchmark( f"{original_graph.number_of_edges()} edges" ) - try: - if backend == "networkx": - converted_graph = original_graph - elif "parallel" in backend: - os.environ["NUM_THREAD"] = str(num_thread) - os.environ["OMP_NUM_THREADS"] = str(num_thread) - os.environ["MKL_NUM_THREADS"] = str(num_thread) - os.environ["OPENBLAS_NUM_THREADS"] = str(num_thread) - - nx.config.backends.parallel.active = True - nx.config.backends.parallel.n_jobs = num_thread - converted_graph = original_graph - elif "cugraph" in backend: - import cugraph + for var_name in [ + "NUM_THREAD", + "OMP_NUM_THREADS", + "MKL_NUM_THREADS", + "OPENBLAS_NUM_THREADS", + ]: + os.environ[var_name] = str(num_thread) + + if backend == "networkx": + return original_graph + + if "parallel" in backend and is_nx_parallel_available(): + try: + nxp = import_module("nx_parallel") + except ImportError: + logger.exception("nx-parallel backend not 
available") + return None + return nxp.ParallelGraph(original_graph) + if "cugraph" in backend and is_nx_cugraph_available(): + try: + cugraph = import_module("nx_cugraph") + except ImportError: + logger.exception("cugraph backend not available") + return None + try: edge_attr = "weight" if nx.is_weighted(original_graph) else None - converted_graph = cugraph.from_networkx( - original_graph, edge_attrs=edge_attr - ) - elif "graphblas" in backend: - import graphblas_algorithms as ga + return cugraph.from_networkx(original_graph, edge_attrs=edge_attr) + except Exception: + logger.exception("Error converting graph to cugraph format") + return None - converted_graph = ga.Graph.from_networkx(original_graph) - else: - logger.error(f"Unsupported backend: {backend}") + if "graphblas" in backend and is_graphblas_available(): + try: + ga = import_module("graphblas_algorithms") + except ImportError: + logger.exception("graphblas_algorithms backend not available") + return None + try: + return ga.Graph.from_networkx(original_graph) + except Exception: + logger.exception("Error converting graph to graphblas format") return None - except Exception: - logger.exception("Error in prepare_benchmark") - return None else: - return converted_graph + logger.error(f"Unsupported backend: {backend}") + return None def do_benchmark( self, @@ -210,11 +230,10 @@ def do_benchmark( try: algo_func = get_algorithm_function(algo_config, backend) - alg_func_name = ( - algo_func.func.__name__ - if hasattr(algo_func, "func") - else algo_func.__name__ - ) + if isinstance(algo_func, partial): + alg_func_name = algo_func.func.__name__ + else: + alg_func_name = algo_func.__name__ logger.debug(f"Got algorithm function: {alg_func_name}") except (ImportError, AttributeError): logger.exception(f"Function not available for backend {backend}") diff --git a/nxbench/benchmarks/config.py b/nxbench/benchmarks/config.py index 8bd09bc..5e1be86 100644 --- a/nxbench/benchmarks/config.py +++ b/nxbench/benchmarks/config.py @@ -94,7 +94,8 @@ class BenchmarkConfig: datasets: list[DatasetConfig] matrix: dict[str, Any] machine_info: dict[str, Any] = field(default_factory=dict) - output_dir: Path = field(default_factory=lambda: Path("../results")) + output_dir: Path = field(default_factory=lambda: Path("~/results")) + env_data: dict[str, Any] = field(default_factory=dict) @classmethod def from_yaml(cls, path: str | Path) -> "BenchmarkConfig": @@ -134,6 +135,8 @@ def from_yaml(cls, path: str | Path) -> "BenchmarkConfig": logger.error(f"'matrix' should be a dict in the config file: {path}") matrix_data = {} + env_data = data.get("env_config") or {} + algorithms = [AlgorithmConfig(**algo_data) for algo_data in algorithms_data] datasets = [DatasetConfig(**ds_data) for ds_data in datasets_data] @@ -143,7 +146,8 @@ def from_yaml(cls, path: str | Path) -> "BenchmarkConfig": datasets=datasets, matrix=matrix_data, machine_info=data.get("machine_info", {}), - output_dir=Path(data.get("output_dir", "../results")), + output_dir=Path(data.get("output_dir", "~/results")), + env_data=env_data, ) def to_yaml(self, path: str | Path) -> None: diff --git a/nxbench/benchmarks/tests/test_benchmark.py b/nxbench/benchmarks/tests/test_benchmark.py index d9e5b8c..c330184 100644 --- a/nxbench/benchmarks/tests/test_benchmark.py +++ b/nxbench/benchmarks/tests/test_benchmark.py @@ -25,17 +25,8 @@ def mock_benchmark(): algorithms = [mock_algo] mock_matrix = { - "req": { - "networkx": ["3.3"], - "nx_parallel": ["0.2"], - "python-graphblas": ["2024.2.0"], - }, - "env": { - 
"NUM_THREAD": ["1", "4", "8"], - "OMP_NUM_THREADS": ["1"], - "MKL_NUM_THREADS": ["1"], - "OPENBLAS_NUM_THREADS": ["1"], - }, + "backend": ["networkx", "parallel"], + "num_threads": ["1", "4", "8"], } mock_config = MagicMock() @@ -43,6 +34,8 @@ def mock_benchmark(): mock_config.algorithms = algorithms mock_config.matrix = mock_matrix + mock_get_config.return_value = mock_config + MockDataManager.return_value.load_network_sync.return_value = ( nx.Graph(), {"metadata": "test"}, @@ -51,16 +44,13 @@ def mock_benchmark(): import nxbench.benchmarks.benchmark importlib.reload(nxbench.benchmarks.benchmark) - from nxbench.benchmarks.benchmark import GraphBenchmark - GraphBenchmark.params = [ - [ds.name for ds in datasets], - ["networkx", "parallel"], - [1, 4, 8], - ] + @nxbench.benchmarks.benchmark.generate_benchmark_methods + class MockGraphBenchmark(nxbench.benchmarks.benchmark.GraphBenchmark): + config = mock_config - benchmark_instance = GraphBenchmark() - benchmark_instance.config = mock_config + benchmark_instance = MockGraphBenchmark() + benchmark_instance.setup() return benchmark_instance @@ -73,9 +63,11 @@ def mock_backends(): def test_backend_selection(mock_backends, mock_benchmark): - assert "networkx" in mock_benchmark.params[1] - assert "parallel" in mock_benchmark.params[1] - assert "cugraph" not in mock_benchmark.params[1] + """Test that available backends are correctly identified.""" + config = mock_benchmark.config + available_backends = ["networkx", "parallel"] + assert all(backend in available_backends for backend in config.matrix["backend"]) + assert "cugraph" not in config.matrix["backend"] def test_graph_benchmark_initialization(mock_benchmark): @@ -86,27 +78,23 @@ def test_graph_benchmark_initialization(mock_benchmark): def test_setup_cache(mock_benchmark): """Test that setup_cache populates graphs from the configuration.""" - with patch("nxbench.benchmarks.benchmark.get_benchmark_config") as mock_get_config: - mock_get_config.return_value = mock_benchmark.config - - mock_benchmark.data_manager.load_network_sync = MagicMock( - return_value=(nx.Graph(), {"metadata": "test"}) - ) + mock_benchmark.data_manager.load_network_sync = MagicMock( + return_value=(nx.Graph(), {"metadata": "test"}) + ) - mock_benchmark.setup_cache() + mock_benchmark.setup_cache() - assert len(mock_benchmark.graphs) == 2 - for dataset in ["test_dataset1", "test_dataset2"]: - assert dataset in mock_benchmark.graphs - graph, metadata = mock_benchmark.graphs[dataset] - assert isinstance(graph, nx.Graph) - assert metadata == {"metadata": "test"} + assert len(mock_benchmark.graphs) == 2 + for dataset in ["test_dataset1", "test_dataset2"]: + assert dataset in mock_benchmark.graphs + graph, metadata = mock_benchmark.graphs[dataset] + assert isinstance(graph, nx.Graph) + assert metadata == {"metadata": "test"} def test_setup_failure(mock_benchmark): - """Test setup_cache for failure when a dataset is not found.""" + """Test setup_cache for failure when a dataset cannot be loaded.""" mock_benchmark.graphs = {} - mock_benchmark.data_manager.load_network_sync.side_effect = Exception( "Failed to load dataset" ) @@ -122,8 +110,14 @@ def test_prepare_benchmark_unsupported_backend(mock_benchmark): assert result is None +def test_prepare_benchmark_missing_dataset(mock_benchmark): + """Test prepare_benchmark when the dataset is not found in the cache.""" + result = mock_benchmark.prepare_benchmark("non_existent_dataset", "networkx") + assert result is None + + def test_do_benchmark_setup_failure(mock_benchmark): - """Test 
the do_benchmark method when setup fails.""" + """Test the do_benchmark method when setup fails (dataset not found).""" mock_algo_config = mock_benchmark.config.algorithms[0] metrics = mock_benchmark.do_benchmark( @@ -133,6 +127,61 @@ def test_do_benchmark_setup_failure(mock_benchmark): assert math.isnan(metrics["memory_used"]) +def test_do_benchmark_func_ref_none(mock_benchmark): + """Test do_benchmark when algo_config.func_ref is None, causing ImportError.""" + mock_algo_config = AlgorithmConfig(name="dummy_algo", func="dummy.module.func") + mock_algo_config.func_ref = None # Simulate func_ref being None + + metrics = mock_benchmark.do_benchmark( + mock_algo_config, "test_dataset1", "networkx", 1 + ) + assert math.isnan(metrics["execution_time"]) + assert math.isnan(metrics["memory_used"]) + + +def test_do_benchmark_algo_execution_exception(mock_benchmark): + """Test do_benchmark when exception occurs during algorithm execution.""" + mock_algo_config = mock_benchmark.config.algorithms[0] + mock_algo_config.func_ref.side_effect = Exception("Algorithm failed") + + # Prepare a valid graph + mock_benchmark.prepare_benchmark = MagicMock(return_value=nx.Graph()) + + metrics = mock_benchmark.do_benchmark( + mock_algo_config, "test_dataset1", "networkx", 1 + ) + + assert math.isnan(metrics["execution_time"]) + assert math.isnan(metrics["memory_used"]) + + +def test_do_benchmark_validation_failure(mock_benchmark): + """Test do_benchmark when validation fails.""" + mock_algo_config = mock_benchmark.config.algorithms[0] + mock_algo_config.func_ref = MagicMock(return_value={"result": "some_value"}) + mock_algo_config.func_ref.__name__ = "dummy_algo_func" + + # Prepare a valid graph and update the graphs dictionary + test_graph = nx.Graph() + mock_benchmark.prepare_benchmark = MagicMock(return_value=test_graph) + mock_benchmark.graphs["test_dataset1"] = (test_graph, {"metadata": "test"}) + + # Mock the BenchmarkValidator to raise an exception during validation + with patch("nxbench.benchmarks.benchmark.BenchmarkValidator") as MockValidator: + mock_validator = MockValidator.return_value + mock_validator.validate_result.side_effect = Exception("Validation failed") + + metrics = mock_benchmark.do_benchmark( + mock_algo_config, "test_dataset1", "networkx", 1 + ) + + # Even if validation fails, metrics should be returned + assert "execution_time" in metrics + assert "memory_used" in metrics + assert not math.isnan(metrics["execution_time"]) + assert not math.isnan(metrics["memory_used"]) + + def test_get_algorithm_function_networkx(mock_benchmark): """Test get_algorithm_function for the networkx backend.""" from nxbench.benchmarks.benchmark import get_algorithm_function @@ -145,8 +194,22 @@ def test_get_algorithm_function_networkx(mock_benchmark): assert func == algo_config.func_ref +def test_get_algorithm_function_func_ref_none(mock_benchmark): + """Test get_algorithm_function when func_ref is None.""" + from nxbench.benchmarks.benchmark import get_algorithm_function + from nxbench.benchmarks.config import AlgorithmConfig + + algo_config = AlgorithmConfig(name="dummy_algo", func="dummy.module.func") + algo_config.func_ref = None + + with pytest.raises(ImportError): + get_algorithm_function(algo_config, "networkx") + + def test_get_algorithm_function_other_backend(mock_benchmark): """Test get_algorithm_function for non-networkx backends.""" + from functools import partial + from nxbench.benchmarks.benchmark import get_algorithm_function from nxbench.benchmarks.config import AlgorithmConfig @@ -154,6 
+217,7 @@ def test_get_algorithm_function_other_backend(mock_benchmark): algo_config.func_ref = MagicMock(name="dummy_func_ref") func = get_algorithm_function(algo_config, "parallel") + assert isinstance(func, partial) assert func.func == algo_config.func_ref assert func.keywords["backend"] == "parallel" @@ -170,18 +234,36 @@ def test_process_algorithm_params(mock_benchmark): def test_process_algorithm_params_with_function(mock_benchmark): """Test processing algorithm parameters with a function reference.""" + import math + from nxbench.benchmarks.benchmark import process_algorithm_params params = {"_pos_arg": 42, "func_ref": {"func": "math.sqrt"}} pos_args, kwargs = process_algorithm_params(params) - import math - assert pos_args == [42] assert "func_ref" in kwargs assert callable(kwargs["func_ref"]) assert kwargs["func_ref"] == math.sqrt +def test_process_algorithm_params_func_import_error(mock_benchmark): + """Test processing algorithm parameters when importing a function fails.""" + from nxbench.benchmarks.benchmark import process_algorithm_params + + params = {"_pos_arg": 42, "func_ref": {"func": "nonexistent.module.func"}} + with pytest.raises(ImportError): + process_algorithm_params(params) + + +def test_process_algorithm_params_func_attribute_error(mock_benchmark): + """Test processing algorithm parameters when the function does not exist.""" + from nxbench.benchmarks.benchmark import process_algorithm_params + + params = {"_pos_arg": 42, "func_ref": {"func": "math.nonexistent_func"}} + with pytest.raises(AttributeError): + process_algorithm_params(params) + + @pytest.mark.parametrize( ("backend_name", "expected"), [ @@ -193,5 +275,28 @@ def test_process_algorithm_params_with_function(mock_benchmark): ) def test_backend_availability(mock_benchmark, backend_name, expected): """Test the availability of different backends.""" - available = backend_name in mock_benchmark.params[1] + available = backend_name in mock_benchmark.config.matrix["backend"] assert available == expected + + +def test_generated_benchmark_methods_exist(mock_benchmark): + """Test that the generated benchmark methods exist on the GraphBenchmark + instance. 
+ """ + methods = [attr for attr in dir(mock_benchmark) if attr.startswith("track_")] + # Get the expected methods + expected_methods = set() + config = mock_benchmark.config + algorithms = config.algorithms + datasets = [ds.name for ds in config.datasets] + backends = config.matrix["backend"] + num_threads = [int(n) for n in config.matrix["num_threads"]] + + for algo in algorithms: + for dataset in datasets: + for backend in backends: + for num_thread in num_threads: + method_name = f"track_{algo.name}_{dataset}_{backend}_{num_thread}" + expected_methods.add(method_name) + + assert set(methods) >= expected_methods diff --git a/nxbench/benchmarks/tests/test_config.py b/nxbench/benchmarks/tests/test_config.py index 4f61901..7fac970 100644 --- a/nxbench/benchmarks/tests/test_config.py +++ b/nxbench/benchmarks/tests/test_config.py @@ -325,19 +325,14 @@ def test_get_benchmark_config_load_default(self): default_config = load_default_config() assert config == default_config - assert len(config.algorithms) == 2 + assert len(config.algorithms) == 1 assert config.algorithms[0].name == "pagerank" - assert config.algorithms[1].name == "louvain_communities" - assert len(config.datasets) == 8 + assert len(config.datasets) == 4 assert config.datasets[0].name == "08blocks" assert config.datasets[1].name == "jazz" assert config.datasets[2].name == "karate" - assert config.datasets[3].name == "patentcite" - assert config.datasets[4].name == "IMDB" - assert config.datasets[5].name == "citeseer" - assert config.datasets[6].name == "enron" - assert config.datasets[7].name == "twitter" + assert config.datasets[3].name == "enron" class TestBenchmarkResult: diff --git a/nxbench/benchmarks/tests/test_utils.py b/nxbench/benchmarks/tests/test_utils.py index 494c563..5abaa80 100644 --- a/nxbench/benchmarks/tests/test_utils.py +++ b/nxbench/benchmarks/tests/test_utils.py @@ -7,8 +7,8 @@ get_available_backends, get_benchmark_config, get_python_version, - is_cugraph_available, is_graphblas_available, + is_nx_cugraph_available, is_nx_parallel_available, ) @@ -28,13 +28,13 @@ def test_backend_availability(): with patch("importlib.util.find_spec") as mock_find_spec: # test when backends are available mock_find_spec.return_value = True - assert is_cugraph_available() is True + assert is_nx_cugraph_available() is True assert is_graphblas_available() is True assert is_nx_parallel_available() is True # test when backends are not available mock_find_spec.return_value = None - assert is_cugraph_available() is False + assert is_nx_cugraph_available() is False assert is_graphblas_available() is False assert is_nx_parallel_available() is False @@ -42,7 +42,7 @@ def test_backend_availability(): def test_get_available_backends(): """Test getting list of available backends.""" with ( - patch("nxbench.benchmarks.utils.is_cugraph_available") as mock_cugraph, + patch("nxbench.benchmarks.utils.is_nx_cugraph_available") as mock_cugraph, patch("nxbench.benchmarks.utils.is_graphblas_available") as mock_graphblas, patch("nxbench.benchmarks.utils.is_nx_parallel_available") as mock_parallel, ): diff --git a/nxbench/benchmarks/utils.py b/nxbench/benchmarks/utils.py index 0a964ec..c1056b8 100644 --- a/nxbench/benchmarks/utils.py +++ b/nxbench/benchmarks/utils.py @@ -32,11 +32,20 @@ def get_benchmark_config() -> BenchmarkConfig: config_file = os.getenv("NXBENCH_CONFIG_FILE") if config_file: - if not Path(config_file).exists(): - raise FileNotFoundError(f"Config file not found: {config_file}") - _BENCHMARK_CONFIG = 
BenchmarkConfig.from_yaml(config_file) + config_path = Path(config_file) + + if not config_path.is_absolute(): + config_path = (Path.cwd() / config_path).resolve() + + if not config_path.exists(): + raise FileNotFoundError(f"Config file not found: {config_path}") + + logger.debug(f"Resolved config file path: {config_path}") + + _BENCHMARK_CONFIG = BenchmarkConfig.from_yaml(str(config_path)) else: _BENCHMARK_CONFIG = load_default_config() + return _BENCHMARK_CONFIG @@ -47,31 +56,21 @@ def load_default_config() -> BenchmarkConfig: func="networkx.algorithms.link_analysis.pagerank_alg.pagerank", params={"alpha": 0.85}, ), - AlgorithmConfig( - name="louvain_communities", - func="networkx.algorithms.community.louvain.louvain_communities", - requires_undirected=True, - ), ] default_datasets = [ DatasetConfig(name="08blocks", source="networkrepository"), DatasetConfig(name="jazz", source="networkrepository"), DatasetConfig(name="karate", source="networkrepository"), - DatasetConfig(name="patentcite", source="networkrepository"), - DatasetConfig(name="IMDB", source="networkrepository"), - DatasetConfig(name="citeseer", source="networkrepository"), DatasetConfig(name="enron", source="networkrepository"), - DatasetConfig(name="twitter", source="networkrepository"), ] default_matrix = { "req": { "networkx": ["3.4.2"], - "nx-parallel": ["0.3"], - "python-graphblas": ["2024.2.0"], + "graphblas_algorithms": ["2023.10.0"], }, "env_nobuild": { - "NUM_THREAD": ["1", "4", "8"], + "NUM_THREAD": ["1", "4"], }, } return BenchmarkConfig( @@ -79,17 +78,16 @@ def load_default_config() -> BenchmarkConfig: datasets=default_datasets, matrix=default_matrix, machine_info={}, - output_dir=Path("../results"), ) -def is_cugraph_available(): +def is_nx_cugraph_available(): try: import importlib.util except ImportError: return False else: - return importlib.util.find_spec("cugraph") is not None + return importlib.util.find_spec("nx_cugraph") is not None def is_graphblas_available(): @@ -98,7 +96,7 @@ def is_graphblas_available(): except ImportError: return False else: - return importlib.util.find_spec("graphblas") is not None + return importlib.util.find_spec("graphblas_algorithms") is not None def is_nx_parallel_available(): @@ -119,7 +117,7 @@ def get_python_version() -> str: def get_available_backends() -> list[str]: backends = ["networkx"] - if is_cugraph_available(): + if is_nx_cugraph_available(): backends.append("cugraph") if is_graphblas_available(): diff --git a/nxbench/cli.py b/nxbench/cli.py index bffcd19..e4452d9 100644 --- a/nxbench/cli.py +++ b/nxbench/cli.py @@ -11,8 +11,10 @@ import click import pandas as pd +import requests from nxbench.benchmarks.config import DatasetConfig +from nxbench.benchmarks.utils import get_benchmark_config from nxbench.data.loader import BenchmarkDataManager from nxbench.data.repository import NetworkRepository from nxbench.log import _config as package_config @@ -31,6 +33,50 @@ def validate_executable(path: str | Path) -> Path: return executable +def get_latest_commit_hash(github_url: str) -> str: + """ + Fetch the latest commit hash from a GitHub repository. + + Parameters + ---------- + github_url : str + The URL of the GitHub repository. + + Returns + ------- + str + The latest commit hash. + + Raises + ------ + ValueError + If the URL is invalid or the API request fails. 
+ """ + if "github.com" not in github_url: + raise ValueError("Provided URL is not a valid GitHub URL") + + parts = github_url.strip("/").split("/") + if len(parts) < 2: + raise ValueError( + "GitHub URL must be in the format 'https://github.com/owner/repo'" + ) + + owner, repo = parts[-2], parts[-1] + + api_url = f"https://api.github.com/repos/{owner}/{repo}/commits" + + try: + response = requests.get(api_url, timeout=3) + response.raise_for_status() + data = response.json() + if not data and isinstance(data, list): + raise ValueError("No commit data found for the repository") + except requests.RequestException: + raise ValueError("Error fetching commit data") + else: + return data[0]["sha"] + + def safe_run( cmd: Sequence[str | Path], check: bool = True, @@ -156,7 +202,8 @@ def has_git(project_root): def run_asv_command( - args: Sequence[str], check: bool = True, use_commit_hash: bool = True + args: Sequence[str], + results_dir: Path | None = None, ) -> subprocess.CompletedProcess: """Run ASV command with dynamic asv.conf.json based on DVCS presence.""" asv_path = get_asv_executable() @@ -198,7 +245,21 @@ def run_asv_command( except FileNotFoundError as e: raise click.ClickException(str(e)) - config_data["pythons"] = [str(get_python_executable())] + env_data = get_benchmark_config().env_data + config_data["pythons"] = env_data["pythons"] + config_data["req"] = env_data["req"] + + if results_dir: + config_data["results_dir"] = str(results_dir) + logger.debug(f"Set results_dir to: {results_dir}") + else: + default_results_dir = Path.cwd() / "results" + config_data["results_dir"] = str(default_results_dir.resolve()) + logger.debug( + "Set results_dir to default 'results' in current working directory." + ) + + config_data["html_dir"] = str(Path(config_data["results_dir"]).parent / "html") with tempfile.TemporaryDirectory() as tmpdir: temp_config_path = Path(tmpdir) / "asv.conf.json" @@ -218,16 +279,18 @@ def run_asv_command( safe_args = ["--config", str(temp_config_path), *safe_args] logger.debug(f"Added --config {temp_config_path} to ASV arguments.") - if use_commit_hash and _has_git: - try: - git_hash = get_git_hash(project_root) - if git_hash != "unknown": - safe_args.append(f"--set-commit-hash={git_hash}") - logger.debug(f"Set commit hash to: {git_hash}") - except subprocess.CalledProcessError: - logger.warning( - "Could not determine git commit hash. Proceeding without it." - ) + if _has_git: + git_hash = get_git_hash(project_root) + else: + git_hash = get_latest_commit_hash(config_data["project_url"]) + + try: + safe_args.append(f"--set-commit-hash={git_hash}") + logger.debug(f"Set commit hash to: {git_hash}") + except subprocess.CalledProcessError: + logger.warning( + "Could not determine git commit hash. Proceeding without it." 
+ ) old_cwd = Path.cwd() if _has_git: @@ -237,7 +300,7 @@ def run_asv_command( try: asv_command = [str(asv_path), *safe_args] logger.debug(f"Executing ASV command: {' '.join(map(str, asv_command))}") - return safe_run(asv_command) + completed_process = safe_run(asv_command) except subprocess.CalledProcessError: logger.exception("ASV command failed.") raise click.ClickException("ASV command failed.") @@ -248,6 +311,7 @@ def run_asv_command( if _has_git: os.chdir(old_cwd) logger.debug(f"Restored working directory to: {old_cwd}") + return completed_process @click.group() @@ -257,9 +321,17 @@ def run_asv_command( type=click.Path(exists=True, dir_okay=False, path_type=Path), help="Path to config file.", ) +@click.option( + "--output-dir", + type=click.Path(file_okay=False, writable=True, path_type=Path), + default=Path.cwd(), + show_default=True, + help="Directory to store benchmark results.", +) @click.pass_context -def cli(ctx, verbose: int, config: Path | None): +def cli(ctx, verbose: int, config: Path | None, output_dir: Path): """NetworkX Benchmarking Suite CLI.""" + # Set verbosity level if verbose >= 2: verbosity_level = 2 elif verbose == 1: @@ -273,11 +345,24 @@ def cli(ctx, verbose: int, config: Path | None): logging.basicConfig(level=log_level) if config: - os.environ["NXBENCH_CONFIG_FILE"] = str(config) - logger.info(f"Using config file: {config}") + absolute_config = config.resolve() + os.environ["NXBENCH_CONFIG_FILE"] = str(absolute_config) + logger.info(f"Using config file: {absolute_config}") + + try: + results_dir = output_dir / "results" + results_dir.mkdir(parents=True, exist_ok=True) + logger.debug(f"Results directory is set to: {results_dir.resolve()}") + except Exception: + logger.exception(f"Failed to create results directory '{results_dir}'") + raise click.ClickException( + f"Failed to create results directory '{results_dir}'" + ) ctx.ensure_object(dict) ctx.obj["CONFIG"] = config + ctx.obj["OUTPUT_DIR"] = output_dir.resolve() + ctx.obj["RESULTS_DIR"] = results_dir.resolve() @cli.group() @@ -355,19 +440,17 @@ def benchmark(ctx): help="Backends to benchmark. 
Specify multiple values to run for multiple backends.", ) @click.option("--collection", type=str, default="all", help="Graph collection to use.") -@click.option( - "--use-commit-hash/--no-commit-hash", - default=False, - help="Whether to use git commit hash for benchmarking.", -) @click.pass_context -def run_benchmark(ctx, backend: tuple[str], collection: str, use_commit_hash: bool): +def run_benchmark(ctx, backend: tuple[str], collection: str): """Run benchmarks.""" config = ctx.obj.get("CONFIG") + output_dir = ctx.obj.get("OUTPUT_DIR", Path.cwd()) + results_dir = ctx.obj.get("RESULTS_DIR", output_dir / "results") + if config: logger.debug(f"Config file used for benchmark run: {config}") - cmd_args = ["run", "--quick"] + cmd_args = ["run"] if package_config.verbosity_level >= 1: cmd_args.append("--verbose") @@ -386,7 +469,10 @@ def run_benchmark(ctx, backend: tuple[str], collection: str, use_commit_hash: bo cmd_args.append("--python=same") try: - run_asv_command(cmd_args, use_commit_hash=use_commit_hash) + run_asv_command( + cmd_args, + results_dir=results_dir, + ) except subprocess.CalledProcessError: logger.exception("Benchmark run failed") raise click.ClickException("Benchmark run failed") @@ -404,10 +490,13 @@ def run_benchmark(ctx, backend: tuple[str], collection: str, use_commit_hash: bo def export(ctx, result_file: Path, output_format: str): """Export benchmark results.""" config = ctx.obj.get("CONFIG") + output_dir = ctx.obj.get("OUTPUT_DIR", Path.cwd()) + results_dir = ctx.obj.get("RESULTS_DIR", output_dir / "results") + if config: logger.debug(f"Using config file for export: {config}") - dashboard = BenchmarkDashboard(results_dir="results") + dashboard = BenchmarkDashboard(results_dir=str(results_dir)) try: if output_format == "sql": @@ -457,7 +546,7 @@ def compare(ctx, baseline: str, comparison: str, threshold: float): "-f", str(threshold), ] - run_asv_command(cmd_args, check=False) + run_asv_command(cmd_args) @cli.group() @@ -486,6 +575,9 @@ def serve(ctx, port: int, debug: bool): def publish(ctx): """Generate static benchmark report.""" config = ctx.obj.get("CONFIG") + output_dir = ctx.obj.get("OUTPUT_DIR", Path.cwd()) + results_dir = ctx.obj.get("RESULTS_DIR", output_dir / "results") + if config: logger.debug(f"Config file used for viz publish: {config}") @@ -504,14 +596,14 @@ def publish(ctx): raise click.ClickException("Script path must be within project directory") try: - safe_run([python_path, process_script, "--results_dir", "results"]) + safe_run([python_path, str(process_script), "--results_dir", str(results_dir)]) logger.info("Successfully processed results.") except (subprocess.SubprocessError, ValueError) as e: logger.exception("Failed to process results") raise click.ClickException(str(e)) - run_asv_command(["publish", "--verbose"], check=False) - dashboard = BenchmarkDashboard() + run_asv_command(["publish", "--verbose"], results_dir=results_dir) + dashboard = BenchmarkDashboard(results_dir=str(results_dir)) dashboard.generate_static_report() diff --git a/nxbench/configs/asv.conf.json b/nxbench/configs/asv.conf.json index 064384c..d4c0b4f 100644 --- a/nxbench/configs/asv.conf.json +++ b/nxbench/configs/asv.conf.json @@ -3,14 +3,15 @@ "project": "nxbench", "timeout": 3000, "project_url": "https://github.com/dpys/nxbench", - "repo": ".", + "dvcs": "git", + "branches": [ + "main" + ], + "repo": "https://github.com/dpys/nxbench", "environment_type": "virtualenv", "show_commit_url": "https://github.com/dpys/nxbench/commit/", "matrix": {}, "benchmark_dir": 
"nxbench/benchmarks", - "env_dir": "env", - "results_dir": "results", - "html_dir": "html", "hash_length": 8, "plugins": [ "asv_runner" diff --git a/nxbench/configs/dummy.yaml b/nxbench/configs/dummy.yaml new file mode 100644 index 0000000..9258547 --- /dev/null +++ b/nxbench/configs/dummy.yaml @@ -0,0 +1,43 @@ +algorithms: + - name: "pagerank" + func: "networkx.pagerank" + params: + alpha: 0.9 + tol: 1.0e-6 + requires_directed: false + groups: ["centrality", "random_walk"] + min_rounds: 10 + warmup: true + warmup_iterations: 50 + +datasets: + - name: "erdos_renyi_small" + source: "generator" + params: + generator: "networkx.erdos_renyi_graph" + n: 1000 + p: 0.01 + metadata: + directed: false + weighted: false + +validation: + skip_slow: false + validate_all: true + error_on_fail: true + report_memory: true + +matrix: + backend: + - "networkx" + - "graphblas" + num_threads: + - "1" + +env_config: + repo: "https://github.com/dpys/nxbench.git" + branches: + - "main" + req: + - "networkx==3.4.2" + - "graphblas_algorithms==2023.10.0" diff --git a/nxbench/configs/example.yaml b/nxbench/configs/example.yaml index c3de8d0..4fc6e3a 100644 --- a/nxbench/configs/example.yaml +++ b/nxbench/configs/example.yaml @@ -1,14 +1,23 @@ algorithms: - # - name: "pagerank" - # func: "networkx.pagerank" - # params: - # alpha: 0.9 - # tol: 1.0e-6 - # requires_directed: false - # groups: ["centrality", "random_walk"] - # min_rounds: 10 - # warmup: true - # warmup_iterations: 50 + - name: "pagerank" + func: "networkx.pagerank" + params: + alpha: 0.9 + tol: 1.0e-6 + requires_directed: false + groups: ["centrality", "random_walk"] + min_rounds: 10 + warmup: true + warmup_iterations: 50 + + - name: "eigenvector_centrality" + func: "networkx.eigenvector_centrality" + requires_directed: false + groups: ["centrality", "path_based"] + min_rounds: 5 + warmup: true + warmup_iterations: 20 + validate_result: "nxbench.validation.validate_node_scores" # - name: "betweenness_centrality" # func: "networkx.betweenness_centrality" @@ -20,7 +29,7 @@ algorithms: # min_rounds: 5 # warmup: true # warmup_iterations: 20 - # validate_result: "nxbench.validation.validate_node_scores" + # validate_result: "nxbench.validation.validate_node_scores" # - name: "edge_betweenness_centrality" # func: "networkx.edge_betweenness_centrality" @@ -40,12 +49,29 @@ algorithms: # groups: ["connectivity", "approximation"] # min_rounds: 3 + - name: "average_clustering" + func: "networkx.average_clustering" + params: {} + requires_directed: false + groups: ["clustering", "graph_structure"] + min_rounds: 3 + validate_result: "nxbench.validation.validate_scalar_result" + - name: "square_clustering" func: "networkx.square_clustering" params: {} requires_directed: false groups: ["clustering", "graph_structure"] min_rounds: 3 + validate_result: "nxbench.validation.validate_node_scores" + + - name: "transitivity" + func: "networkx.transitivity" + params: {} + requires_directed: false + groups: ["clustering", "graph_structure"] + min_rounds: 3 + validate_result: "nxbench.validation.validate_scalar_result" # - name: "all_pairs_node_connectivity" # func: "networkx.algorithms.connectivity.connectivity.all_pairs_node_connectivity" @@ -76,12 +102,12 @@ algorithms: # groups: ["paths", "all_pairs"] # min_rounds: 3 - # - name: "all_pairs_shortest_path_length" - # func: "networkx.all_pairs_shortest_path_length" - # params: {} - # requires_directed: false - # groups: ["paths", "distance"] - # min_rounds: 3 + - name: "all_pairs_shortest_path_length" + func: 
"networkx.all_pairs_shortest_path_length" + params: {} + requires_directed: false + groups: ["paths", "distance"] + min_rounds: 3 # - name: "all_pairs_shortest_path" # func: "networkx.all_pairs_shortest_path" @@ -150,27 +176,15 @@ datasets: source: "networkrepository" params: {} - - name: "patentcite" - source: "networkrepository" - params: {} - - - name: "citeseer" - source: "networkrepository" - params: {} - - - name: "twitter" - source: "networkrepository" - params: {} - - # - name: "erdos_renyi_small" - # source: "generator" - # params: - # generator: "networkx.erdos_renyi_graph" - # n: 1000 - # p: 0.01 - # metadata: - # directed: false - # weighted: false + - name: "erdos_renyi_small" + source: "generator" + params: + generator: "networkx.erdos_renyi_graph" + n: 1000 + p: 0.01 + metadata: + directed: false + weighted: false # - name: "watts_strogatz_medium" # source: "generator" @@ -213,18 +227,17 @@ validation: matrix: backend: - "networkx" - - "parallel" - "graphblas" num_threads: - "1" + - "2" - "4" - "8" -asv_config: - repo: "https://github.com/dpys/nxbench.git" - branches: - - "main" +env_config: req: - "networkx==3.4.2" - - "nx_parallel==0.3" - "graphblas_algorithms==2023.10.0" + pythons: + - "3.10" + - "3.11" diff --git a/nxbench/log.py b/nxbench/log.py index 8b97d63..8be05b1 100644 --- a/nxbench/log.py +++ b/nxbench/log.py @@ -265,8 +265,7 @@ def set_verbosity_level(self, level: int) -> None: LoggingHandlerConfig( handler_type="console", level=log_level, - formatter="%(asctime)s - %(name)s - %(levelname)s - % " - "(message)s", + formatter="%(asctime)s - %(name)s - %(levelname)s - %(message)s", # noqa: E501 ) ], ) diff --git a/nxbench/validation/__init__.py b/nxbench/validation/__init__.py index e69de29..9b5ed21 100644 --- a/nxbench/validation/__init__.py +++ b/nxbench/validation/__init__.py @@ -0,0 +1 @@ +from .base import *