Fixes to the "English-to-Spanish Translation with a Sequence-to-Sequence Transformer" Code Example (#1997)

* bugfix: Encoder and decoder inputs were flipped. Given 30 epochs of training, the model never produced sensible output. Examples:

  1) Tom didn't like Mary. → [start] ha estoy qué
  2) Tom called Mary and canceled their date. → [start] sola qué yo pasatiempo visto campo

  When fitting the model, the following relevant warning was emitted:

  ```
  UserWarning: The structure of `inputs` doesn't match the expected structure: ['encoder_inputs', 'decoder_inputs']. Received: the structure of inputs={'encoder_inputs': '*', 'decoder_inputs': '*'}
  ```

  After the fix, the model outputs sentences that are close to proper Spanish:

  1) That's what Tom told me. → [start] eso es lo que tom me dijo [end]
  2) Does Tom like cheeseburgers? → [start] a tom le gustan las queso de queso [end]

* Fix compute_mask in PositionalEmbedding. The existing check essentially disables the mask calculation: this layer is the first one to receive the input and therefore never has a previous mask. With this change the mask is now passed on to the encoder. This looks like a regression; the example's initial commit is very similar to this fix.

* Propagate both the encoder and decoder sequence masks to the decoder. As per https://github.com/tensorflow/tensorflow/blob/6550e4bd80223cdb8be6c3afd1f81e86a4d433c3/tensorflow/python/keras/engine/base_layer.py#L965, the inputs should be passed as a list, not as kwargs. When this is done, both masks are received as a tuple in the mask argument.

* Apply both padding masks in the attention layers and during loss computation.

* Regenerate the ipynb/md files for the NMT example.
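To make the first bullet concrete, here is a minimal sketch of the shape the training data has to take: the English tokens travel under the `encoder_inputs` key and the shifted Spanish tokens under the `decoder_inputs` key, matching the names of the model's two `keras.Input` layers. The `eng_vectorization`/`spa_vectorization` names are assumed from the example's `TextVectorization` layers; this illustrates the intent of the fix, not the exact diff.

```python
from tensorflow import keras

# The Input layers in the example are named "encoder_inputs" and
# "decoder_inputs"; the dataset dict below must use the same keys.
encoder_inputs = keras.Input(shape=(None,), dtype="int64", name="encoder_inputs")
decoder_inputs = keras.Input(shape=(None,), dtype="int64", name="decoder_inputs")


def format_dataset(eng, spa):
    # eng_vectorization / spa_vectorization are the TextVectorization layers
    # defined earlier in the example (assumed here, not defined in this sketch).
    eng = eng_vectorization(eng)
    spa = spa_vectorization(spa)
    return (
        {
            "encoder_inputs": eng,           # English source -> encoder
            "decoder_inputs": spa[:, :-1],   # Spanish, minus last token -> decoder
        },
        spa[:, 1:],  # labels: Spanish shifted one step ahead
    )
```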
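The compute_mask fix in the second bullet boils down to deriving the padding mask from the token ids themselves instead of returning None when no upstream mask exists (which is always the case, since this layer is the first to see the inputs). A sketch, assuming a PositionalEmbedding layer along the lines of the example's:

```python
import tensorflow as tf
from tensorflow.keras import layers


class PositionalEmbedding(layers.Layer):
    """Token + position embeddings, roughly following the example's layer."""

    def __init__(self, sequence_length, vocab_size, embed_dim, **kwargs):
        super().__init__(**kwargs)
        self.token_embeddings = layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
        self.position_embeddings = layers.Embedding(input_dim=sequence_length, output_dim=embed_dim)
        self.supports_masking = True

    def call(self, inputs):
        positions = tf.range(start=0, limit=tf.shape(inputs)[-1], delta=1)
        return self.token_embeddings(inputs) + self.position_embeddings(positions)

    def compute_mask(self, inputs, mask=None):
        # Before the fix this returned None whenever `mask` was None, which is
        # always the case here because this is the first layer to see the raw
        # token ids, so no padding mask ever reached the encoder or decoder.
        # Build the mask from the token ids instead: id 0 marks padding.
        return tf.math.not_equal(inputs, 0)
```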
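For the third and fourth bullets, a hedged sketch of how the decoder can receive both padding masks and how the loss can ignore padded targets. The layer below is heavily simplified (the example's feed-forward block and layer normalizations are omitted), the loss helper name is illustrative, and the built-in `use_causal_mask` flag of `MultiHeadAttention` (TF 2.10+) stands in for the example's hand-rolled causal mask.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class TransformerDecoder(layers.Layer):
    """Mask handling only; everything else from the example is left out."""

    def __init__(self, embed_dim, num_heads, **kwargs):
        super().__init__(**kwargs)
        self.self_attention = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.cross_attention = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.supports_masking = True

    def call(self, inputs, mask=None):
        # Invoked as decoder([x, encoder_outputs]), so Keras gathers the padding
        # masks of both tensors into a tuple and hands them over here.
        decoder_inputs, encoder_outputs = inputs
        decoder_mask = encoder_mask = None
        if mask is not None:
            decoder_mask, encoder_mask = mask

        # Reshape each padding mask to (batch, 1, key_length) so that
        # MultiHeadAttention can broadcast it over the query positions.
        self_attn_mask = None if decoder_mask is None else tf.cast(decoder_mask[:, tf.newaxis, :], "int32")
        cross_attn_mask = None if encoder_mask is None else tf.cast(encoder_mask[:, tf.newaxis, :], "int32")

        x = self.self_attention(
            query=decoder_inputs, value=decoder_inputs, key=decoder_inputs,
            attention_mask=self_attn_mask, use_causal_mask=True,
        )
        return self.cross_attention(
            query=x, value=encoder_outputs, key=encoder_outputs,
            attention_mask=cross_attn_mask,
        )


def masked_sparse_categorical_crossentropy(y_true, y_pred):
    # Only average the loss over real target tokens; id 0 marks padding.
    loss = keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
    mask = tf.cast(tf.math.not_equal(y_true, 0), loss.dtype)
    return tf.reduce_sum(loss * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)
```

In the functional model this corresponds to calling `decoder([x, encoder_outputs])` rather than passing `encoder_outputs` as a separate argument, which is what lets Keras collect the masks of both tensors and deliver them as a tuple in the `mask` argument.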