Transformations

If you've already had a go at making your own SentenceBuilder resources on SentenceBuilders.com, you are likely to have experienced issues with sentences which don't appear quite as you would like them to.

Maybe you've wanted to know how to include annotations in the SentenceBuilder but not have these appear in the sentences that it generates? e.g. markers for masculine and feminine, such as "(m)" or "(f)"

Perhaps you've wanted to know how to make the English items appear in the correct order, for things like nouns and adjectives? e.g. "une voiture rouge" = "a car red" ???

Or how to solve the word order issues in German, especially around verbs being sent to the end of the clause? e.g. "Ich bin ins Kino gegangen" = "I am to the cinema gone" ???

These problems occur because sentences are generated (in both the L1 and the L2) by simply combining the items as they appear in the SentenceBuilder, moving from left to right.

You need to add something extra -- beyond just the SentenceBuilder contents -- if you want your SentenceBuilder to generate sentences that work.

And that something extra is... 

(1) TRANSFORMATIONS

Transformations are a powerful tool: essentially a list of instructions to the SentenceBuilder program to make all sorts of changes to the raw output produced by your SentenceBuilder.

They are accessed via this section on the SentenceBuilder resource creator / editor page:

Transformations are added (by you!) as a list -- one transformation per line -- to the box that appears in the transformations pop-up (accessed by clicking on the "View / Edit transformations" button shown in the highlighted green section in the above image).

They are applied in the order that they appear, and their effect is cumulative, in the sense that the 1st transformation is applied to the raw text; the 2nd is applied to the result of the 1st; the 3rd is applied to the result of the 2nd, etc. until all transformations have been applied.

Transformations affect:

  1. Vocab chunks
  2. Sentences, which are broken down into:
    • Word-sentences (used for word-based activities)
    • Chunk-sentences (used for chunk-based activities)

You may not have been aware that there were 2 sets of Sentences generated by your SentenceBuilder. So let's start by explaining why that is so and how it works...

(2) WORDS VS. CHUNKS: THE 2 TYPES OF SENTENCES AND HOW THEY WORK

(2.1) "Word-sentences:"

Sentences are formed by combining items from the various cells that make up a viable route through a SentenceBuilder table. They work as follows. (Note that I will use an addition character "+" to represent cell boundaries...)

I went+to the cinema+with my brother

The items are joined together using a space " ", so the items above combine to form the sentence:

"I went to the cinema with my brother"

This is an example of a "word-sentence".

The word-based sentence activities break this word-sentence into sections by using the space " " to split it up, resulting in 8 words, which are then used in a variety of ways in all sorts of activities. e.g. word as gap, jumbled words, etc.

(2.2) "Chunk-sentences":

Chunk-based sentence activities, however, work in a slightly different way:

Our cell contents "I went+to the cinema+with my brother" first have all of the spaces converted to a # character, like this:

I#went+to#the#cinema+with#my#brother

Then these items are joined together using a space " ", giving the following "chunk-sentence":

"I#went to#the#cinema with#my#brother"

The above is an example of a "chunk-sentence".

The chunk-based sentence activities break this chunk-sentence into sections using the space " ". This results in 3 chunks, which are then used in a variety of ways in all sorts of activities. e.g. chunk as gap, jumbled chunks, etc.

It's important to understand the above so that you can understand how transformations work when applied to both word-sentences and chunk-sentences.
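The two joining rules described in this section can be sketched in a few lines of Python. This is only an illustration of the behaviour explained above, not the site's actual code; the function names are made up for the example.

```python
def word_sentence(cells):
    """Join the cell contents with a space to form the word-sentence."""
    return " ".join(cells)


def chunk_sentence(cells):
    """Turn the spaces inside each cell into '#', then join the cells with a space."""
    return " ".join(cell.replace(" ", "#") for cell in cells)


# One route through the table; each list item is the content of one cell.
cells = ["I went", "to the cinema", "with my brother"]

ws = word_sentence(cells)
cs = chunk_sentence(cells)

print(ws)                   # I went to the cinema with my brother
print(cs)                   # I#went to#the#cinema with#my#brother
print(len(ws.split(" ")))   # 8 (words)
print(len(cs.split(" ")))   # 3 (chunks)
```

Splitting on the space " " gives words in the first case and chunks in the second, which is why the same activities can run on either type of sentence.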

(3) TRANSFORMATIONS AFFECT VOCAB-CHUNKS, WORD-SENTENCES AND CHUNK-SENTENCES

Transformations are applied in the order in which they are listed, one by one, to all vocab-chunks (either as defined by default by the cell boundaries, or as defined by you using the vocab-chunks popup) and all sentences (word & chunk).

So from the examples given above, all transformations will be applied one by one to all of the following bits of text:

Vocab-chunks, e.g.:

I went
to the cinema
with my brother

Word-sentences, e.g.:

I went to the cinema with my brother

Chunk-sentences, e.g.:

I#went to#the#cinema with#my#brother

And remember: transformations are applied in the order that they appear, and their effect is cumulative, in the sense that the 1st transformation is applied to the raw text; the 2nd is applied to the result of the 1st; the 3rd is applied to the result of the 2nd, etc. until all transformations have been applied.

(4) SIMPLE TRANSFORMATIONS

At their simplest level, transformations use two "greater than" characters ">>" to instruct the program to make changes to the raw outputs, in the following way:

text before>>text after

i.e. in all vocab-chunks, word-sentences and chunk-sentences, transform all examples of "text before" to "text after"

(Here, "text before" and "text after" are used as examples to explain the process; as variables, if you like. In reality you will use actual bits of text from your SentenceBuilder.)
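To make the idea concrete, here is a minimal Python sketch of the ">>" syntax and of the cumulative, in-order application described in section 3. It is an illustration only, not the site's actual code: the function names and the two rules in the example ("cinema>>theatre" and "the theatre>>a theatre") are made up for the demonstration.

```python
def parse_simple(line):
    """Split 'text before>>text after' at the first '>>' (spaces kept as typed)."""
    before, after = line.split(">>", 1)
    return before, after


def apply_all(text, lines):
    """Apply every transformation in order; each one works on the previous result."""
    for line in lines:
        before, after = parse_simple(line)
        text = text.replace(before, after)
    return text


rules = ["cinema>>theatre", "the theatre>>a theatre"]  # made-up rules

for text in ["I went", "to the cinema", "I went to the cinema with my brother"]:
    print(apply_all(text, rules))
# I went
# to a theatre
# I went to a theatre with my brother

# Reversing the order changes the result: "the theatre" doesn't exist yet
# when the first rule runs, so only "cinema>>theatre" has any effect.
print(apply_all("to the cinema", list(reversed(rules))))
# to the theatre
```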

(4.1) A common use scenario:

Here's a common use scenario for transformations: switching the order of nouns and adjectives, in English sentences generated by SentenceBuilders designed to teach languages where the adjective goes after the noun, such as Spanish, French, Italian, etc.

For example, the Spanish SentenceBuilder items might be:

Vivo en+una casa+bonita

...and the English equivalent:

I live in+a house+pretty

The simplest way to fix this word order issue (but not necessarily the best! See the section on "look-ups" below...) is to add a simple transformation such as this:

house pretty>>pretty house

(4.2) Transforming the word-sentence:

The above transformation ("house pretty>>pretty house") does the following to the word-sentence:

"I live in a house pretty" becomes "I live in a pretty house"

(4.3) Transforming the chunk-sentence:

The above transformation ("house pretty>>pretty house") does the following to the chunk-sentence:

"I#live#in a#house pretty" becomes "I#live#in a#pretty#house"

Note that in the chunk-sentence above, we've gone from 3 chunks originally to 2 chunks following the transformation. This is because we have a transformation that crosses a cell boundary ("a house" from one cell combines with "pretty" from another cell). We're telling it to take the text from those two cells and to combine it. Since it has no way of knowing where any chunk-divisions need to occur, it combines it into one chunk. And remember: spaces are always represented by the # character in the chunks that make up chunk-sentences.

We can avoid this by using the underscore character _ on the right-hand side of the transformation, to tell the algorithm to maintain the actual space. This would look like this:

house pretty>>pretty_house

This would still have the same effect on the word-sentence, because the _ is treated as a space, but the resulting chunk-sentence would look like this:

"I#live#in a#house pretty" becomes "I#live#in a#pretty house" (i.e. still 3 chunks)

N.B.: Your "text before" (i.e. the text on the left of your transformation) should be a maximum of 5 words!!
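The difference between a plain space and the underscore _ in chunk-sentences can be sketched in Python as follows. Please treat this as a rough model that reproduces the two results shown in section 4.3, not as the site's actual code. In particular, I'm assuming here that a space on the left of the transformation can match either a space or a # in the chunk-sentence.

```python
import re


def apply_to_chunk_sentence(sentence, line):
    """Apply one 'before>>after' line to a chunk-sentence (assumed behaviour)."""
    before, after = line.split(">>", 1)
    # Assumption: each space (or '#') in 'before' may match ' ' or '#'.
    pattern = "[ #]".join(re.escape(part) for part in re.split("[ #]", before))
    # In the result, a plain space becomes '#' and '_' becomes a real space.
    result = after.replace(" ", "#").replace("_", " ")
    return re.sub(pattern, lambda match: result, sentence)


raw = "I#live#in a#house pretty"

merged = apply_to_chunk_sentence(raw, "house pretty>>pretty house")
kept = apply_to_chunk_sentence(raw, "house pretty>>pretty_house")

print(merged, len(merged.split(" ")))   # I#live#in a#pretty#house 2
print(kept, len(kept.split(" ")))       # I#live#in a#pretty house 3
```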

(5) LOOK-UPS

Look-ups allow you to write transformations which are much more targeted, since they only affect vocab-chunks, word-sentences and chunk-sentences which contain the text in the look-up.

Use a vertical pipe character "|" to add a look-up.

On my Spanish keyboard, the | character is accessed by using "alt gr" + number 1.
I believe it is available directly on a UK or US keyboard. If not, the combination of the "alt" key + 124 on the number pad will produce the symbol.
On a Mac it can apparently be typed using "Alt" + "Shift" + "L".
You may need to look for an alternative way of doing it if none of the above works for you...

Everything to the left of the "|" is the look-up, and everything to the right of it is the corresponding transformation:

It works like this:

lookup|before>>after

This instruction means: if the text contains "lookup", change "before" to "after".

N.B. Here, "lookup", "before" and "after" are used as examples, as variables. You will use the appropriate bits of text from your SentenceBuilder...

Look-ups allow you to make changes in specific circumstances, rather than to all texts. If the text (sentence / chunk) doesn't contain the look-up item, the transformation is ignored.
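As a rough Python sketch of that rule (an illustration only, with made-up function names, and assuming that the line is split at the first "|" and the first ">>"):

```python
def parse(line):
    """Return (lookup, before, after); lookup is None when the line has no '|'."""
    lookup = None
    if "|" in line:
        lookup, line = line.split("|", 1)
    before, after = line.split(">>", 1)
    return lookup, before, after


def apply(text, line):
    """Apply one transformation, but only if the look-up (if any) is in the text."""
    lookup, before, after = parse(line)
    if lookup is not None and lookup not in text:
        return text  # look-up not found: the transformation is ignored
    return text.replace(before, after)


print(apply("the lookup word comes before", "lookup|before>>after"))
# the lookup word comes after
print(apply("nothing special comes before", "lookup|before>>after"))
# nothing special comes before
```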

(Some examples of look-ups are provided in the next section, "Markers", which also deals with look-ups.)

(6) MARKERS

You can add "markers" to your SentenceBuilder so that your look-ups and transformations are more precise, only affecting the items that you want to affect.

It's best to use characters that are not a natural part of the text items themselves. I tend to use the "middle dot" character as a marker. It looks like this · and I like it because (a) it is not likely to appear naturally in texts in most languages, and (b) it is fairly unobtrusive visually.

The middle dot · appears on the number 3 key on my Spanish keyboard (so shift + 3).
You can also type it by holding down the "alt" key and typing 250 or 0183 on the number pad.
On a Mac it can apparently be typed by using "Opt"+"Shift"+9.

Markers can be added in a couple of ways:

  1. Add markers to the SentenceBuilder itself
  2. Add markers via transformations

A marker can be used as a look-up by itself, or to make a look-up or transformation more precise, so that it affects only the occurrences of a text that you want to change and ignores all the others.

So for example, you might "mark" all of the feminine items, so that a transformation will only be applied to those feminine items.

Or you might mark only certain occurrences of a word that appears a lot in the SentenceBuilder, so that the ones you wish to transform are affected and the others are left alone.

See this example Spanish SentenceBuilder (from the "Hacer" section of "Spanish Verb Pivots"):

You'll see that the English in the white cell in the middle says "I do/make". We don't want students to have to write that out each time when doing sentence activities from Spanish to English, so we want to transform that appropriately depending on the words that go with it.

While the resource contains several other transformations to do with correcting the English word order, below are the transformations that deal specifically with this "do/make" issue:

·|do/make>>make
question|do/make>>ask
trip|do/make>>go on
visit|do/make>>pay
do/make |do/make>>do
·>>

The first of these uses the · marker as a look-up. The marker appears in the SentenceBuilder before all of the English items that require "make" rather than "do". Do you see the marker?

·|do/make>>make
i.e. if the text contains "·", change "do/make" to "make"

The next three use look-ups with items in the SentenceBuilder, changing the default "do/make" to a more appropriate verb for that particular collocation: "ask" (a question), "go on" (a trip), "pay" (a visit).

The fifth one deals with the remainder of cases (i.e. where we want "hacer" to be translated as "do"). It uses a look-up "do/make |" (i.e. with a trailing space in the look-up) so as to avoid transforming "I do/make" to "I do" for the vocab-chunk, which we want to keep as "hago" = "I do/make". (Since the vocab-chunk is only based on the content of that cell -- "I do/make" -- a look-up for the text "do/make " (with the trailing space) will fail, and no transformation will occur on the vocab chunk.)

The last one is just tidying up. It removes all of the · characters from any vocab-chunks or sentences that may contain it.
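Here is a small Python sketch that runs those six transformations, in order, over a few texts. It is a simplified model rather than the site's actual code, and the sample English texts ("·a cake", "my homework", etc.) are invented for the illustration, since the real ones come from the SentenceBuilder itself.

```python
RULES = [
    "·|do/make>>make",
    "question|do/make>>ask",
    "trip|do/make>>go on",
    "visit|do/make>>pay",
    "do/make |do/make>>do",   # note the trailing space in the look-up
    "·>>",
]


def transform(text, rules=RULES):
    """Apply the rules one by one; a rule is skipped if its look-up is missing."""
    for rule in rules:
        # rpartition gives an empty look-up when the rule has no "|".
        lookup, _, change = rule.rpartition("|")
        before, after = change.split(">>", 1)
        if lookup in text:   # an empty look-up is always "found"
            text = text.replace(before, after)
    return text


# Invented word-sentences, plus the vocab-chunk "I do/make" on its own.
for text in ["I do/make ·a cake", "I do/make a question",
             "I do/make my homework", "I do/make"]:
    print(transform(text))
# I make a cake
# I ask a question
# I do my homework
# I do/make
```

The last line shows why the trailing space matters: the vocab-chunk "I do/make" has nothing after it, so the fifth look-up fails and the chunk is left alone.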

(7) EFFICIENT TRANSFORMATIONS (COMBINING MARKERS AND LOOK-UPS)

Markers and look-ups can be combined to achieve more efficient transformations and to avoid repetition. Let's take the earlier Spanish example of the noun + adjective in the English translation being transformed to adjective + noun.

The example we used above was:

"house pretty>>pretty house" (or "house pretty>>pretty_house" if we wanted to avoid the chunks being combined for chunk-sentences)

...and this worked fine for a single noun and adjective pair.

But what if we have a situation where we need to transform multiple nouns and adjectives, as in the example below (taken from Spanish Primary part 1)?

Here we have a list of 16 animals + 11 colours or adjectives describing them. So using the same format as the simple transformation above would require us to write out 16 x 11 sets of transformations. i.e. a total of 176 transformations!

Quite apart from the fact that this exceeds the maximum number of transformations allowed (which is 150), it's way too much data, way too time-consuming, and it will have a significant impact on page load, due to those 176 transformations having to be applied to all vocab-chunks, word-sentences and chunk-sentences (in both languages!!).

By using look-ups and markers, however, we can make this much more efficient. The resource shown above has a total of 28 transformations, which I've grouped into blocks below so that I can explain what's happening in each block.

(Remember: transformations are applied to all texts  -- i.e. chunks / sentences -- in the order that they appear, and their effect is cumulative, in the sense that the 1st transformation is applied to the raw text; the 2nd is applied to the result of the 1st; the 3rd is applied to the result of the 2nd, etc. until all transformations have been applied.)

(7.1) Transformations 1 and 2:

The first 2 transformations don't involve a look-up:

a (m)>>a-
a (f)>>a-

Here, we've achieved 2 things. Firstly, we don't need "(m)" and "(f)" from the SentenceBuilder to appear in any of the chunks or sentences next to "a", so we've removed them. Secondly, we've added a hyphen "-" as a marker (having taken care to ensure that "a-" doesn't naturally occur elsewhere in the SentenceBuilder) so that we can transform it, but without affecting any of the letter "a"s that appear in the Spanish, for example (such as the ending of feminine adjectives or nouns).

Following this transformation, our texts have been transformed in the following way:

"a (m) dog blue" becomes "a- dog blue"

(7.2) Transformations 3 to 24:

Next we have essentially the same pair of transformations appearing 11 times, once for each of the adjectives whose position we need to transform:

blue|a->>a blu*e
blu*e| blue>>
brown|a->>a bro*wn
bro*wn| brown>>
orange|a->>an ora*nge
ora*nge| orange>>
pink|a->>a pi*nk
pi*nk| pink>>
green|a->>a gre*en
gre*en| green>>
big|a->>a bi*g
bi*g| big>>
yellow|a->>a yell*ow
yell*ow| yellow>>
white|a->>a whi*te
whi*te| white>>
black|a->>a bla*ck
bla*ck| black>>
red|a->>a r*ed
r*ed| red>>
small|a->>a sm*all
sm*all| small>>

I'll take the first pair and explain what's happening. The 1st half of the pair is this:

blue|a->>a blu*e
i.e. if the text contains "blue", change "a-" to "a blu*e"

Note that we've added an asterisk "*" as a marker. This is required so that the 2nd transformation in the pair (below) will work properly. Following this 1st transformation in the pair, we have achieved the following:

"a- dog blue" becomes "a blu*e dog blue"

The 2nd half of the pair of transformations is this:

blu*e| blue>>
i.e. if the text contains "blu*e", change " blue" to nothing

This removes the " blue" (note the space before the word!) from our sentence so far -- "a blu*e dog blue" -- leaving just "a blu*e dog".

So the point of adding the asterisk as a marker in the 1st half of the pair is that it allows us to remove the one occurrence of the word "blue" that we no longer want. (Without it, " blue>>" would have removed both occurrences of the adjective.)

That pair is repeated for all of the adjectives.

(7.3) Transformations 25 and 26:

#(m)>>
#(f)>>

These two remove the " (m)" and the " (f)" from all of the adjectives, since these are not required in the chunks and sentences themselves and are only there for info on the SentenceBuilder.

Note that I've left the # character at the start of the line. This is treated as a space (i.e. the same as " ") but it means that I can see straight away that the space has been added before the text.

(7.4) Transformations 27 and 28:

*>>
->>

Tidying up, essentially. We're removing markers that we've added in previous steps. Those 2 transformations say, respectively, replace "*" with nothing, and replace "-" with nothing.

We remove the "*" so that our adjectives say "blue", "green" etc, rather than "blu*e" and "gre*en".

And we remove the "-" because this resource has vocab-chunk definitions that don't include the adjectives (e.g. "Tengo un perro" / "I have a dog"), so none of the adjective transformations above will have been applied, leaving things like "I have a- dog". We need to remove the "-" to leave "I have a dog".
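To see all 28 transformations working together, here is a Python sketch that applies them in order to a few English texts. It only models word-sentences and vocab-chunks, it is not the site's actual code, and the sample texts (e.g. "I have a (f) cow orange (f)") are my own guesses at what the raw output might look like.

```python
RULES = [
    "a (m)>>a-", "a (f)>>a-",
    "blue|a->>a blu*e", "blu*e| blue>>",
    "brown|a->>a bro*wn", "bro*wn| brown>>",
    "orange|a->>an ora*nge", "ora*nge| orange>>",
    "pink|a->>a pi*nk", "pi*nk| pink>>",
    "green|a->>a gre*en", "gre*en| green>>",
    "big|a->>a bi*g", "bi*g| big>>",
    "yellow|a->>a yell*ow", "yell*ow| yellow>>",
    "white|a->>a whi*te", "whi*te| white>>",
    "black|a->>a bla*ck", "bla*ck| black>>",
    "red|a->>a r*ed", "r*ed| red>>",
    "small|a->>a sm*all", "sm*all| small>>",
    "#(m)>>", "#(f)>>",
    "*>>", "->>",
]


def transform(text, rules=RULES):
    """Apply every rule in order to a word-sentence or vocab-chunk."""
    for rule in rules:
        lookup, _, change = rule.rpartition("|")   # empty look-up if no "|"
        change = change.replace("#", " ")          # "#" is treated as a space
        before, after = change.split(">>", 1)
        if lookup in text:                         # empty look-up always matches
            text = text.replace(before, after)
    return text


print(len(RULES))                                  # 28
print(transform("I have a (m) dog blue (m)"))      # I have a blue dog
print(transform("I have a (f) cow orange (f)"))    # I have an orange cow
print(transform("I have a (m) dog"))               # I have a dog
```

The third text is a vocab-chunk without an adjective: none of the adjective look-ups match, so it is the final "->>" that turns "I have a- dog" into "I have a dog".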

(8) A FEW COMMON USE SCENARIOS

There are all sorts of reasons why you may wish or need to apply transformations to a SentenceBuilder resource. Below are some common ones, the sorts of things I get asked about quite regularly.

(8.1) Switching nouns and adjectives (word order):

See the complete example provided in section 7 above.

(8.2) German verb sent to end of sentence:

See this blog post which deals specifically with this:
Direct translations + word order transformations: a German example

(8.3) Adding apostrophe (elision):

Imagine we have content from 2 adjacent cells like this:

"parce que+il aime ça"

We have several options for dealing with this. The first is a simple transformation, such as this:

que il>>qu'il

The result of this transformation on the word-sentence would be:

"parce que+il aime ça" becomes "parce qu'il aime ça"

The result for the chunk-sentence would be:

"parce#que+il#aime#ça" becomes "parce#qu'il#aime#ça" (i.e. a single chunk)

Alternatively, we could keep that piece of text as 2 chunks in the chunk-sentence by using a transformation which includes an underscore _ to maintain the space, like this:

parce que il>>parce_qu'il
i.e. keep the space after "parce", so we'd end up with "parce qu'il#aime#ça" (2 chunks, as before)

...or even:

que il >>qu'il_
i.e. keep the space after "il", so we'd have "parce#qu'il aime#ça" (again, 2 chunks, as before)
(Note the space after "il" on the left of the transformation!)
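The three simple variants above can be compared side by side with a short Python sketch. As before, this is a rough model and not the site's actual code: I'm assuming that a space on the left of a transformation can match either a space or a # in the chunk-sentence, and the third rule is written with nothing after the underscore.

```python
import re


def apply_to_chunk_sentence(sentence, line):
    """Apply one 'before>>after' line to a chunk-sentence (assumed behaviour)."""
    before, after = line.split(">>", 1)
    # Assumption: each space (or '#') in 'before' may match ' ' or '#'.
    pattern = "[ #]".join(re.escape(part) for part in re.split("[ #]", before))
    # In the result, a plain space becomes '#' and '_' becomes a real space.
    result = after.replace(" ", "#").replace("_", " ")
    return re.sub(pattern, lambda match: result, sentence)


raw = "parce#que il#aime#ça"   # the two cells, joined as a chunk-sentence

for rule in ["que il>>qu'il", "parce que il>>parce_qu'il", "que il >>qu'il_"]:
    out = apply_to_chunk_sentence(raw, rule)
    print(out, len(out.split(" ")))
# parce#qu'il#aime#ça 1
# parce qu'il#aime#ça 2
# parce#qu'il aime#ça 2
```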

Or we could use a look-up, like this:

que il|parce que>>parce
parce il|il>>qu'il

To be honest, this last one is basically just a more complex way of achieving the same as this, from above:
"parce que il>>parce_qu'il"

(8.4) Correctly positioning punctuation:

Sometimes we might want to include a punctuation mark in the cell following the one where you would more naturally put it. Here is a good example from German (from our German Primary part 1):

In German, we have to include a comma before the word "weil" (="because"), but we have the problem that some sentences are affirmative and some are negative, so we can't include the comma in the cells to the left of "weil". Instead it's been included as ", weil sie" and ", weil es".

Now, we know from the information in section 2 above that the cell contents are joined using a space " ", so for the image shown, it would go something like this:

"Ich mag+meine Stadt+, weil sie..." becomes "Ich mag meine Stadt , weil sie..."

The problem we have here is that there is an additional space before the comma. And unless we fix this, students will have to write out that extra space in their activities so as not to lose points.

The way we fix it is like this:

#, >>,_

The # isn't necessary; I could have simply used a space. But I've included it here so that it's clear that there is a space at the start of the transformation.

The _ instructs the program to leave a space after the comma, even for chunk-sentences. So it has the effect of adding the comma to the previous chunk.
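Here is a Python sketch of what that single transformation does to both types of sentence. It is a simplified model (not the site's actual code), the function names are made up, and the example sentence is cut off after "weil sie" just as in the example above. For the chunk-sentence I'm assuming that a space or # on the left can match either character.

```python
import re

RULE = "#, >>,_"


def fix_word_sentence(sentence, rule=RULE):
    """Word-sentences: '#' and '_' in the rule both simply mean a space."""
    before, after = rule.split(">>", 1)
    before = before.replace("#", " ")
    after = after.replace("#", " ").replace("_", " ")
    return sentence.replace(before, after)


def fix_chunk_sentence(sentence, rule=RULE):
    """Chunk-sentences (assumed behaviour): only '_' gives a real space."""
    before, after = rule.split(">>", 1)
    pattern = "[ #]".join(re.escape(part) for part in re.split("[ #]", before))
    result = after.replace(" ", "#").replace("_", " ")
    return re.sub(pattern, lambda match: result, sentence)


print(fix_word_sentence("Ich mag meine Stadt , weil sie"))
# Ich mag meine Stadt, weil sie

chunks = fix_chunk_sentence("Ich#mag meine#Stadt ,#weil#sie")
print(chunks)             # Ich#mag meine#Stadt, weil#sie
print(chunks.split(" "))  # ['Ich#mag', 'meine#Stadt,', 'weil#sie']
```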

(8.5) Maintaining English infinitive for vocab-chunks, removing "to" for sentences:

Sometimes we want to include the "to" in the English infinitive so that the vocab chunk makes sense, but we may need to remove that "to" depending on what goes before it in English. See this example:

On the one hand, we want our English translation of the vocab-chunk "arreglar mi habitación" to say "to tidy my room". On the other hand, we know that we can't have sentences that say "I can to tidy...", "I usually to tidy..." or "I have to to tidy". (And we need "tengo que" to be translated as "I have to", not just "I have", hence the extra "to"...)

You'll see that I've included a middle dot marker "·" before each "to" in the SentenceBuilder itself. It is there so that we can fix the English in the combinations listed above.

Here are the transformations:

to ·to|·to >>
usually ·to|·to >>
can ·to|·to >>
·>>

The 1st one deals with "I have to to". It says: if the sentence contains "to ·to", change "·to " to nothing, thereby removing the "·to " before all of those infinitives in sentences containing "I have to". The middle dot marker is needed here so that we can distinguish between the two occurrences of "to".

The 2nd and 3rd ones deal with "usually" and "can", essentially in the same way as above.

The 4th one tidies up, removing any middle dot characters that may be left anywhere, since we never want these to appear in any of the vocab-chunks, word-sentences or chunk-sentences.
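These four transformations can be tried out with a short Python sketch. It is a simplified model for word-sentences and vocab-chunks only (not the site's actual code), and the full sample sentences are assembled by me from the fragments mentioned above.

```python
RULES = [
    "to ·to|·to >>",
    "usually ·to|·to >>",
    "can ·to|·to >>",
    "·>>",
]


def transform(text, rules=RULES):
    """Apply the rules in order; a rule is skipped if its look-up is missing."""
    for rule in rules:
        lookup, _, change = rule.rpartition("|")   # empty look-up if no "|"
        before, after = change.split(">>", 1)
        if lookup in text:                         # empty look-up always matches
            text = text.replace(before, after)
    return text


for text in ["I have to ·to tidy my room",
             "I usually ·to tidy my room",
             "I can ·to tidy my room",
             "·to tidy my room"]:          # the vocab-chunk on its own
    print(transform(text))
# I have to tidy my room
# I usually tidy my room
# I can tidy my room
# to tidy my room
```

Only the vocab-chunk keeps its "to", because none of the three look-ups is found in it; the final "·>>" just removes the marker.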

(9) REFERENCE: SPECIAL CHARACTERS USED FOR TRANSFORMATIONS

  • >> to indicate transformation
    As used in all of the examples above, use >> to mean "changes to".
    this>>that (this changes to that)

  • | for look-ups
    See the section above on "look-ups" and "markers"

  • # to show space at beginning or end of line
    If included in a transformation it is converted to a space.
    A useful (but non-essential) method of making it clear to yourself that your transformation has a space at the beginning or end

  • _ to maintain space when combining chunks
    As mentioned in section 4, "Simple Transformations", above.

(10) A FEW GOLDEN RULES TO REMEMBER

  • 1 by 1, and cumulative
    All transformations are applied one by one, in the order listed, and their effect is cumulative, in the sense that the 1st transformation is applied to the raw text; the 2nd is applied to the result of the 1st; the 3rd is applied to the result of the 2nd, etc. until all transformations have been applied.

  • Maximum 5 words on left of transformation
    In the equation "before>>after", your "before" must NOT contain more than 5 words. If the text that you wish to transform is longer than 5 words, you'll have to use a different technique (markers, look-ups, etc) to achieve the desired effect.

  • Single-chunk sentence will be broken into words
    If the result of a transformation is a chunk-sentence which is essentially a single chunk (i.e. no spaces at all, only # divisions), the chunk-sentence will in fact be split into its component words, so as to prevent errors in the chunk-based sentence activities.

  • Maximum 150 transformations allowed
    You should ideally aim for way fewer than this though. Remember to combine look-ups and markers to achieve more efficient transformations.

  • Check impact on all 3 sets of data
    Remember to look at all 3 sets of text data generated by your SentenceBuilder to check that it is outputting things correctly. Those 3 sets of text data are:
            (1) vocab-chunks
            (2) word-sentences
            (3) chunk-sentences
    You can check all 3 of these on the resource creator / editor page.

  • No one way to do it
    There are many different ways of achieving the same effect. The approach that you take will depend on context: the layout and composition of your particular SentenceBuilder.
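If you keep your transformations in a text file, the two numeric limits above are easy to check before pasting them in. Here is a hypothetical little Python checker. It is not part of the site, and the way it counts words (splitting the left-hand side on spaces, with # counted as a space) is my own assumption about how the 5-word limit is measured.

```python
MAX_TRANSFORMATIONS = 150
MAX_WORDS_BEFORE = 5


def check(lines):
    """Return a list of warnings for a list of transformation lines."""
    warnings = []
    if len(lines) > MAX_TRANSFORMATIONS:
        warnings.append(
            f"{len(lines)} transformations (maximum is {MAX_TRANSFORMATIONS})"
        )
    for number, line in enumerate(lines, start=1):
        change = line.rpartition("|")[2]        # drop the look-up, if any
        before = change.split(">>", 1)[0]       # text on the left of ">>"
        words = before.replace("#", " ").split()
        if len(words) > MAX_WORDS_BEFORE:
            warnings.append(f"line {number}: {len(words)} words before '>>'")
    return warnings


print(check(["house pretty>>pretty_house", "blue|a->>a blu*e"]))
# []
print(check(["one two three four five six>>too long"]))
# ["line 1: 6 words before '>>'"]
print(check(["a>>b"] * 176))
# ['176 transformations (maximum is 150)']
```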

(11) PDF SUMMARY...

Here is a handy summary of the main points above. But please refer back to this guide for full details and examples of usage :)