When data ends up in quarantine, it is possible to re-ingest it using a Copy Pipe. Create a pull request following these steps:
- Create a new Pipe to select, fix, and copy the quarantine rows. In our case it looks like `analytics_events_quarantine_to_final.pipe`:

```
NODE copy_quarantine
SQL >
    SELECT
        toDateTime(fromUnixTimestamp64Milli(toUInt64(assumeNotNull(timestamp)) * 1000)) timestamp,
        assumeNotNull(session_id) session_id,
        assumeNotNull(action) action,
        assumeNotNull(version) version,
        assumeNotNull(payload) payload
    FROM analytics_events_quarantine

TYPE COPY
TARGET_DATASOURCE analytics_events
```
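Before writing the Pipe, it can help to inspect what's actually in quarantine so you know which columns need fixing. A quick check with the CLI (the query is just an example; adjust it to your data):

```bash
# Peek at a few quarantined rows to see which columns arrived null or mistyped
tb sql "SELECT * FROM analytics_events_quarantine LIMIT 10"
```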
- Create a custom deployment `0.0.1`.
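Where the deployment scripts live depends on your project setup; in Tinybird's CI/CD templates, custom deployment scripts typically sit in a versioned folder. A sketch of that assumed layout:

```bash
# Assumed layout from Tinybird's CI/CD templates: deploy/<version>/deploy.sh
mkdir -p deploy/0.0.1
touch deploy/0.0.1/deploy.sh
chmod +x deploy/0.0.1/deploy.sh
```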
- In the custom deployment file `deploy.sh`:
- Let's append incorrect data to `analytics_events` using a fixture (that's required to create the quarantine Data Source):

```bash
set +e
tb datasource append analytics_events datasources/fixtures/analytics_events_errors.ndjson
set -e
```
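The fixture itself isn't shown in this guide. As a hypothetical example, `analytics_events_errors.ndjson` could contain rows that fail the Data Source schema (here, a null `version`), which is what sends them to quarantine; adapt the columns and values to your actual schema:

```bash
# Hypothetical fixture content: the null "version" fails schema validation,
# so the row lands in analytics_events_quarantine (column names are assumptions)
cat > datasources/fixtures/analytics_events_errors.ndjson <<'EOF'
{"timestamp": "1684886071", "session_id": "b7b1965c-620a-402a-afe5-2d0eea0f9a34", "action": "pageview", "version": null, "payload": "{}"}
EOF
```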
- Don't forget the `set +e` command when the incorrect data is being appended; otherwise the pipeline will finish with an error.
- Run the Copy Pipe to fix the incorrect data and re-ingest it into `analytics_events`:
```bash
tb pipe copy run analytics_events_quarantine_to_final --wait --yes
```
- You can also add a test to check that the copy is working and you get data in `analytics_events`. In this case we're looking for a row we know exists in the quarantine:

```bash
output=$(tb sql "SELECT * FROM analytics_events WHERE session_id == 'b7b1965c-620a-402a-afe5-2d0eea0f9a34'")
if [[ $output == *"No rows"* ]]; then
  echo "Information was not copied from quarantine to final Data Source 'analytics_events'"
  exit 1
fi
```
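Putting the snippets above together, the full `deploy.sh` for the custom deployment might look like the following sketch (the shebang and `set -e` framing are assumptions; the paths and fixture name are taken from the steps above):

```bash
#!/bin/bash
set -e

# Append the broken fixture rows; disable exit-on-error here because this
# append is expected to fail and create the quarantine Data Source
set +e
tb datasource append analytics_events datasources/fixtures/analytics_events_errors.ndjson
set -e

# Fix the quarantined rows and re-ingest them into the final Data Source
tb pipe copy run analytics_events_quarantine_to_final --wait --yes

# Check that a row we know was quarantined made it into analytics_events
output=$(tb sql "SELECT * FROM analytics_events WHERE session_id == 'b7b1965c-620a-402a-afe5-2d0eea0f9a34'")
if [[ $output == *"No rows"* ]]; then
  echo "Information was not copied from quarantine to final Data Source 'analytics_events'"
  exit 1
fi
```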
- Once you've tested the Copy Pipe in CI, you can remove the custom deployment and merge the pull request.
- After the changes are merged, you can run the Copy Pipe in the main Workspace:

```bash
tb pipe copy run analytics_events_quarantine_to_final --wait --yes
```
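After the production run you can verify the result and, if everything looks good, clear the quarantine. A hedged sketch (the count query is just one way to verify, and truncating assumes your CLI version supports emptying quarantine Data Sources):

```bash
# Confirm the recovered rows are now in the final Data Source
tb sql "SELECT count() FROM analytics_events"

# Optionally empty the quarantine once the data has been recovered
tb datasource truncate analytics_events_quarantine --yes
```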