Release 3.0 docs (#3770)
Co-authored-by: Pavel Semyonov <[email protected]>
Co-authored-by: Anna Balaeva <[email protected]>
Co-authored-by: TarantoolBot <[email protected]>
Co-authored-by: Kseniia Antonova <[email protected]>
5 people authored Dec 27, 2023
1 parent ffefe88 commit b02b66e
Showing 870 changed files with 155,437 additions and 5,016 deletions.
34 changes: 34 additions & 0 deletions .github/workflows/push-pot.yml
@@ -0,0 +1,34 @@
name: Push POTs
on:
  push:
    branches:
      - '3.0'
permissions:
  contents: write
jobs:
  generate-pot:
    runs-on: ubuntu-latest
    container: tarantool/doc-builder:fat-4.3
    steps:
      - uses: actions/checkout@v3

      - name: Generate Portable Object Templates
        run: |
          cmake .
          make update-pot
      - name: Commit generated pots
        run: |
          git config --global --add safe.directory /__w/doc/doc
          git config --global user.name 'TarantoolBot'
          git config --global user.email '[email protected]'
          if [[ $(git status) =~ .*"nothing to commit".* ]]; then
            echo "status=nothing-to-commit"
            exit 0
          fi
          git add locale/en
          git commit -m "updated pot"
          git push origin 3.0
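
For reference, the POT regeneration this workflow performs can be reproduced locally before pushing. This is only a sketch: it assumes a checkout of this repository and a toolchain equivalent to the ``tarantool/doc-builder:fat-4.3`` container (CMake plus the Sphinx gettext tooling behind the ``update-pot`` target).

.. code-block:: console

   $ cmake .
   $ make update-pot
   $ git status locale/en    # review the regenerated templates before committing
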
1 change: 1 addition & 0 deletions .gitignore
@@ -29,6 +29,7 @@ webhooks/.env

locale/*
!locale/ru
!locale/en

# redundant folders created by sphinx

6 changes: 0 additions & 6 deletions .gitmodules
@@ -1,9 +1,3 @@
[submodule "modules/cartridge"]
path = modules/cartridge
url = https://github.com/tarantool/cartridge.git
[submodule "modules/cartridge-cli"]
path = modules/cartridge-cli
url = https://github.com/tarantool/cartridge-cli.git
[submodule "modules/metrics"]
path = modules/metrics
url = https://github.com/tarantool/metrics.git
44 changes: 0 additions & 44 deletions build_submodules.sh
@@ -12,50 +12,6 @@ po_dest="${project_root}/locale/ru/LC_MESSAGES"
cp README.rst doc/contributing/docs/_includes/README.rst


# Cartridge
cartridge_root="${project_root}/modules/cartridge"

# Build Cartridge to extract docs
cd "${cartridge_root}" || exit
CMAKE_DUMMY_WEBUI=true tarantoolctl rocks make

# Copy Cartridge docs, including diagrams and images
cartridge_rst_src="${cartridge_root}/build.luarocks/build.rst"
cartridge_rst_dest="${project_root}/doc/book/cartridge"
cd "${cartridge_rst_src}" || exit
mkdir -p "${cartridge_rst_dest}"
find . -iregex '.*\.\(rst\|png\|puml\|svg\)$' -exec cp -r --parents {} "${cartridge_rst_dest}" \;

# Copy translation templates
cartridge_pot_src="${cartridge_root}/build.luarocks/build.rst/locale"
cartridge_pot_dest="${project_root}/locale/book/cartridge"
cd "${cartridge_pot_src}" || exit
mkdir -p "${cartridge_pot_dest}"
find . -name '*.pot' -exec cp -rv --parents {} "${cartridge_pot_dest}" \;

# Copy translations
cartridge_po_src="${cartridge_root}/build.luarocks/build.rst/locale/ru/LC_MESSAGES"
cartridge_po_dest="${po_dest}/book/cartridge"
cd "${cartridge_po_src}" || exit
mkdir -p "${cartridge_po_dest}"
find . -name '*.po' -exec cp -rv --parents {} "${cartridge_po_dest}" \;


# Cartridge CLI
cartridge_cli_root="${project_root}/modules/cartridge-cli/doc"
cartridge_cli_dest="${cartridge_rst_dest}/cartridge_cli"
cartridge_cli_po_dest="${po_dest}/book/cartridge/cartridge_cli"

# Copy Cartridge CLI docs, including diagrams and images
mkdir -p "${cartridge_cli_dest}"
cd ${cartridge_cli_root} || exit
find . -iregex '.*\.\(rst\|png\|puml\|svg\)$' -exec cp -rv --parents {} "${cartridge_cli_dest}" \;

# Copy translations
mkdir -p "${cartridge_cli_po_dest}"
cd "${cartridge_cli_root}/locale/ru/LC_MESSAGES/doc/" || exit
find . -name '*.po' -exec cp -rv --parents {} "${cartridge_cli_po_dest}" \;

# Monitoring
monitoring_root="${project_root}/modules/metrics/doc/monitoring"
monitoring_dest="${project_root}/doc/book"
6 changes: 1 addition & 5 deletions conf.py
@@ -61,7 +61,7 @@
project = u'Tarantool'

# |release| The full version, including alpha/beta/rc tags.
release = "2.11.1"
release = "3.0.0"
# |version| The short X.Y version.
version = '.'.join(release.split('.')[0:2])

@@ -73,10 +73,6 @@
'how-to/using_docker.rst',
'reference/configuration/cfg_*',
'images',
'book/cartridge/cartridge_overview.rst',
'book/cartridge/CONTRIBUTING.rst',
'book/cartridge/topics',
'book/cartridge/cartridge_api/modules/cartridge.test-helpers.rst',
'reference/reference_rock/luatest/README.rst',
'reference/reference_rock/luatest/modules/luatest.rst',
'**/_includes/*'
1 change: 0 additions & 1 deletion doc/alternate_build_master.rst
@@ -14,7 +14,6 @@
how-to/index
concepts/index
CRUD operations <reference/reference_lua/box_space>
book/cartridge/index
book/admin/index
book/connectors
enterprise/index
Binary file added doc/book/admin/admin_instances_dev.png
Binary file added doc/book/admin/admin_instances_prod.png
185 changes: 110 additions & 75 deletions doc/book/admin/disaster_recovery.rst
@@ -1,126 +1,161 @@
.. _admin-disaster_recovery:

================================================================================
Disaster recovery
================================================================================
=================

The minimal fault-tolerant Tarantool configuration would be a
:ref:`replication cluster<replication-topologies>`
The minimal fault-tolerant Tarantool configuration would be a :ref:`replica set <replication-architecture>`
that includes a master and a replica, or two masters.
The basic recommendation is to configure all Tarantool instances in a replica set to create :ref:`snapshot files <index-box_persistence>` on a regular basis.

The basic recommendation is to configure all Tarantool instances in a cluster to
create :ref:`snapshot files <index-box_persistence>` at a regular basis.
Here are action plans for typical crash scenarios.

Here follow action plans for typical crash scenarios.

.. _admin-disaster_recovery-master_replica:

--------------------------------------------------------------------------------
Master-replica
--------------------------------------------------------------------------------
--------------

Configuration: One master and one replica.
.. _admin-disaster_recovery-master_replica_manual_failover:

Problem: The master has crashed.
Master crash: manual failover
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Your actions:
**Configuration:** master-replica (:ref:`manual failover <replication-master_replica_bootstrap>`).

1. Ensure the master is stopped for good. For example, log in to the master
machine and use ``systemctl stop tarantool@<instance_name>``.
**Problem:** The master has crashed.

2. Switch the replica to master mode by setting
:ref:`box.cfg.read_only <cfg_basic-read_only>` parameter to *false* and let
the load be handled by the replica (effective master).
**Actions:**

3. Set up a replacement for the crashed master on a spare host, with
:ref:`replication <cfg_replication-replication>` parameter set to replica
(effective master), so it begins to catch up with the new master’s state.
The new instance should have :ref:`box.cfg.read_only <cfg_basic-read_only>`
parameter set to *true*.
1. Ensure the master is stopped.
For example, log in to the master machine and use ``tt stop``.

You lose the few transactions in the master
:ref:`write ahead log file <index-box_persistence>`, which it may have not
transferred to the replica before crash. If you were able to salvage the master
.xlog file, you may be able to recover these. In order to do it:
2. Configure a new replica set leader using the :ref:`<replicaset_name>.leader <configuration_reference_replicasets_name_leader>` option.

1. Find out the position of the crashed master, as reflected on the new master.
3. Reload configuration on all instances using :ref:`config:reload() <config-module>`.

a. Find out instance UUID from the crashed master :ref:`xlog <internals-wal>`:
4. Make sure that a new replica set leader is a master using :ref:`box.info.ro <box_introspection-box_info>`.

.. code-block:: console
5. On a new master, :ref:`remove a crashed instance from the '_cluster' space <replication-remove_instances-remove_cluster>`.

$ head -5 *.xlog | grep Instance
Instance: ed607cad-8b6d-48d8-ba0b-dae371b79155
6. Set up a replacement for the crashed master on a spare host.

b. On the new master, use the UUID to find the position:
See also: :ref:`Performing manual failover <replication-controlled_failover>`.
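
As a minimal illustration of steps 3–5 on the instance that becomes the new leader, the session below is a sketch only: the ``app:instance002`` prompt, the instance id ``1``, and the returned tuple are illustrative values, not taken from this diff.

.. code-block:: tarantoolsession

   app:instance002> require('config'):reload()
   ---
   ...
   app:instance002> box.info.ro
   ---
   - false
   ...
   app:instance002> box.space._cluster:delete(1)  -- id of the crashed instance
   ---
   - [1, '9bb111c2-3ff5-36a7-00f4-2b9a573ea660', 'instance001']
   ...
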

.. code-block:: tarantoolsession

tarantool> box.info.vclock[box.space._cluster.index.uuid:select{'ed607cad-8b6d-48d8-ba0b-dae371b79155'}[1][1]]
---
- 23425
<...>
.. _admin-disaster_recovery-master_replica_auto_failover:

2. Play the records from the crashed .xlog to the new master, starting from the
new master position:
Master crash: automated failover
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

a. Issue this request locally at the new master's machine to find out
instance ID of the new master:
**Configuration:** master-replica (:ref:`automated failover <replication-bootstrap-auto>`).

.. code-block:: tarantoolsession
**Problem:** The master has crashed.

tarantool> box.space._cluster:select{}
---
- - [1, '88580b5c-4474-43ab-bd2b-2409a9af80d2']
...
**Actions:**

b. Play the records to the new master:
1. Use ``box.info.election`` to make sure a new master is elected automatically.

.. code-block:: console
2. On a new master, :ref:`remove a crashed instance from the '_cluster' space <replication-remove_instances-remove_cluster>`.

3. Set up a replacement for the crashed master on a spare host.

See also: :ref:`Testing automated failover <replication-automated-failover-testing>`.
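
A sketch of what step 1 might look like on the instance that has won the election; the prompt, term, and numeric ids below are illustrative:

.. code-block:: tarantoolsession

   app:instance002> box.info.election
   ---
   - leader_idle: 0
     leader_name: instance002
     state: leader
     vote: 2
     term: 3
     leader: 2
   ...
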


.. _admin-disaster_recovery-master_replica_data_loss:

Data loss
~~~~~~~~~

**Configuration:** master-replica.

**Problem:** Some transactions are missing on a replica after the master has crashed.

**Actions:**

You lose a few transactions in the master
:ref:`write-ahead log file <index-box_persistence>`, which may not have been
transferred to the replica before the crash. If you were able to salvage the master
``.xlog`` file, you may be able to recover these.

1. Find out instance UUID from the crashed master :ref:`xlog <internals-wal>`:

.. code-block:: console

   $ head -5 var/lib/instance001/*.xlog | grep Instance
   Instance: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660

2. On the new master, use the UUID to find the position:

.. code-block:: tarantoolsession

   app:instance002> box.info.vclock[box.space._cluster.index.uuid:select{'9bb111c2-3ff5-36a7-00f4-2b9a573ea660'}[1][1]]
   ---
   - 999
   ...

3. :ref:`Play the records <tt-play>` from the crashed ``.xlog`` to the new master, starting from the
new master position:

.. code-block:: console

   $ tt play 127.0.0.1:3302 var/lib/instance001/00000000000000000000.xlog \
       --from 1000 \
       --replica 1 \
       --username admin --password secret

$ tt play <new_master_uri> <xlog_file> --from 23425 --replica 1
.. _admin-disaster_recovery-master_master:

--------------------------------------------------------------------------------
Master-master
--------------------------------------------------------------------------------
-------------

**Configuration:** :ref:`master-master <replication-bootstrap-master-master>`.

**Problem:** one master has crashed.

Configuration: Two masters.
**Actions:**

Problem: Master#1 has crashed.
1. Let the load be handled by another master alone.

Your actions:
2. Remove a crashed master from a replica set.

1. Let the load be handled by master#2 (effective master) alone.
3. Set up a replacement for the crashed master on a spare host.
Learn more from :ref:`Adding and removing instances <replication-master-master-add-remove-instances>`.

2. Follow the same steps as in the
:ref:`master-replica <admin-disaster_recovery-master_replica>` recovery scenario
to create a new master and salvage lost data.

.. _admin-disaster_recovery-data_loss:

--------------------------------------------------------------------------------
Data loss
--------------------------------------------------------------------------------
Master-replica/master-master: data loss
---------------------------------------

**Configuration:** master-replica or master-master.

**Problem:** Data was deleted at one master and this data loss was propagated to the other node (master or replica).

**Actions:**

1. Put all nodes in read-only mode.
Depending on the :ref:`replication.failover <configuration_reference_replication_failover>` mode, this can be done as follows:

- ``manual``: change a replica set leader to ``null``.
- ``election``: set :ref:`replication.election_mode <configuration_reference_replication_election_mode>` to ``voter`` or ``off`` at the replica set level.
- ``off``: set ``database.mode`` to ``ro``.

Configuration: Master-master or master-replica.
Reload configurations on all instances using the ``reload()`` function provided by the :ref:`config <config-module>` module.

Problem: Data was deleted at one master and this data loss was propagated to the
other node (master or replica).
2. Turn off deletion of expired checkpoints with :doc:`/reference/reference_lua/box_backup/start`.
This prevents the Tarantool garbage collector from removing files
made with older checkpoints until :doc:`/reference/reference_lua/box_backup/stop` is called.
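
A minimal sketch of step 2, run in the admin console of an instance that still holds the needed checkpoints (the file path in the output is illustrative):

.. code-block:: tarantoolsession

   app:instance001> box.backup.start()
   ---
   - - /var/lib/tarantool/instance001/00000000000000000015.snap
   ...
   app:instance001> box.backup.stop()  -- call this after recovery is finished
   ---
   ...
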

The following steps are applicable only to data in memtx storage engine.
Your actions:
3. Get the latest valid :ref:`.snap file <internals-snapshot>` and
use ``tt cat`` command to calculate at which LSN the data loss occurred.

1. Put all nodes in :ref:`read-only mode <cfg_basic-read_only>` and disable
deletion of expired checkpoints with :doc:`/reference/reference_lua/box_backup/start`.
This will prevent the Tarantool garbage collector from removing files
made with older checkpoints until :doc:`/reference/reference_lua/box_backup/stop` is called.
4. Start a new instance and use :ref:`tt play <tt-play>` command to
play to it the contents of ``.snap`` and ``.xlog`` files up to the calculated LSN.
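
A rough sketch of steps 3 and 4, assuming hypothetical file paths and that ``<last_good_lsn>`` has already been determined from the record headers printed by ``tt cat``:

.. code-block:: console

   $ tt cat var/lib/instance001/00000000000000000000.snap | less
   $ tt play <new_instance_uri> var/lib/instance001/00000000000000000000.snap \
         var/lib/instance001/*.xlog \
         --to <last_good_lsn> --username admin --password secret
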

2. Get the latest valid :ref:`.snap file <internals-snapshot>` and
use ``tt cat`` command to calculate at which lsn the data loss occurred.
5. Bootstrap a new replica from the recovered master.

3. Start a new instance (instance#1) and use ``tt play`` command to
play to it the contents of .snap/.xlog files up to the calculated lsn.
.. NOTE::

4. Bootstrap a new replica from the recovered master (instance#1).
The steps above are applicable only to data in the memtx storage engine.
