2 comments
Created by mumble 4 months, 1 week ago

The goal of this was to replicate the idea of offline learning some sequences, then, given an input sequence, predicting which learnt sequence it belongs to, with some tolerance for noise. And that is it. The learnt sequences are represented using HTM-like mini-columns, which give each digit its own context. For example, the start of the sequence Pi, 3.14159, is encoded as 3' 1' 4' 1'' 5' 9' (ignoring the decimal point for simplicity), where A' is the mini-column version of the SDR for A. Though to be clear, our SDRs have float coefficients, not {0,1}, but conceptually, and in terms of their properties, they are essentially identical to binary SDRs.

The approximate mathematics this is doing:
Given an input sequence |v1 . v2 . v3>,
find x such that f(x) approx-eq v1, f(x + delta) approx-eq v2, and f(x + 2*delta) approx-eq v3,
where the exact meaning of "a approx-eq b" is a consequence of how you define your encode operator.
Indeed, presumably, if we used an encoder that maps words to SDRs, as cortical.io does, then
an input sequence of "the tiger ate a sheep" should match a learnt sequence of "the lion ate a lamb".
And something similar for learning and recalling melodies.

Here is the definition of our scalar encoder, which probably doesn't make too much sense, but approximates a Gaussian:

encode |*> #=> rescale smooth[0.1]^10 |_self>

If we want a binary SDR instead we could perhaps use:

encode |*> #=> clean smooth[0.1]^5 |_self>

but we will stick with our Gaussian for now.

In particular, here is our scalar encoding of "10":

sa: encode |10>
0.0|9> + 0.0|9.1> + 0.001|9.2> + 0.006|9.3> + 0.026|9.4> + 0.084|9.5> + 0.21|9.6> + 0.42|9.7> + 0.682|9.8> + 0.909|9.9> + |10> + 0.909|10.1> + 
0.682|10.2> + 0.42|10.3> + 0.21|10.4> + 0.084|10.5> + 0.026|10.6> + 0.006|10.7> + 0.001|10.8> + 0.0|10.9> + 0.0|11>

This encoder has nice similarity properties with respect to our similarity measure (we don't use the dot product):

sa: ket-simm(encode |10>, encode |10>)
|simm>

sa: ket-simm(encode |10>, encode |10.5>)
0.263|simm>

sa: ket-simm(encode |10>, encode |11>)
0.027|simm>

sa: ket-simm(encode |10>, encode |12>)
0.0|simm>

But depending on what you are doing you might want your Gaussians to be wider. That's for the future.
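For intuition, here is a rough Python sketch of this kind of scalar encoder and similarity measure. The Gaussian width sigma, the grid span, and the exact simm formula are my illustrative assumptions; the real smooth[0.1]^10 pipeline and ket-simm differ in detail, so the numbers won't exactly match the values above.

```python
import math

# Sketch: a Gaussian scalar encoder on a 0.1-spaced grid, plus a
# rescaled-overlap similarity measure (deliberately not a dot product).

def encode(x, sigma=0.2, step=0.1, span=1.0):
    """Map a scalar to a {grid point: coefficient} 'SDR' with a Gaussian bump at x."""
    n = int(round(span / step))
    grid = [round(x + i * step, 1) for i in range(-n, n + 1)]
    return {g: math.exp(-((g - x) ** 2) / (2 * sigma ** 2)) for g in grid}

def simm(a, b):
    """Similarity in [0, 1]: sum of pointwise minimums over the larger total.
    Identical SDRs score 1; non-overlapping SDRs score 0."""
    overlap = sum(min(a[k], b[k]) for k in set(a) & set(b))
    return overlap / max(sum(a.values()), sum(b.values()))
```

As with the examples above, simm(encode(10), encode(10)) is exactly 1, the score drops smoothly as the two scalars move apart, and scalars further apart than the grid span score essentially 0.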

Given a scalar encoder we can now offline learn some sample digits of our two sequences, Pi and e. See the end of the post (1) for full details. The idea should generalize easily to any sequence of floats, though as a consequence of our current proof-of-concept scalar encoder the floats must be limited to one decimal place. Recall, this is just a toy for now! That restriction should be fixable when we implement a full Gaussian encoder. Indeed, a full Gaussian encoder should also enable sequences of 2D or 3D co-ordinates, which are potentially more interesting. Again, details left for the future.

How does our code represent sequences? Basically, we use a very simplified model of a neuron: given an input SDR, predict an output SDR. A chain of these then represents a full sequence. Note that the idea of mini-columns is critical here; in my code it is implemented using the random-column[10] operator. Without it we couldn't represent sequences with repeated digits, and we could only represent one sequence at a time.
So, how does random-column[k] work? Given a ket with D dimensions, it maps that ket to D+1 dimensions, with a random value in the new dimension in the range {0, 1, ..., k-1}. The effect is that our SDRs are unique each time we use them in a sequence. cf. HTM theory.

Here we have 1D kets mapped to 2D:

sa: random-column[10] (|x1> + |x2> + |x3>)
|x1: 7> + |x2: 0> + |x3: 3>

Here we have 2D kets mapped to 3D:

sa: random-column[10] (|x1: y1> + |x2: y2> + |x3: y3>)
|x1: y1: 2> + |x2: y2: 9> + |x3: y3: 9>

And it is random, so each invoke provides a different mapping:

sa: random-column[10] (|x1> + |x2> + |x3>)
|x1: 4> + |x2: 6> + |x3: 4>

sa: random-column[10] (|x1: y1> + |x2: y2> + |x3: y3>)
|x1: y1: 5> + |x2: y2: 1> + |x3: y3: 8>

And, as in HTM theory, the probability of a collision is fairly small, and can be made smaller by increasing k. By the way, we can undo this mapping using the extract-category operator, and we use this in our code, but I won't go into the details here.
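A minimal Python sketch of what random-column[k] and extract-category do, representing a superposition as a dict of ket-name to coefficient and a mini-column as a ': <digit>' suffix. The names and representation are my own guesses at the behaviour described above:

```python
import random

def random_column(sdr, k=10):
    """Append a random value in {0, ..., k-1} as a new trailing dimension
    to every ket in the superposition."""
    return {f"{ket}: {random.randrange(k)}": coeff for ket, coeff in sdr.items()}

def extract_category(sdr):
    """Undo random_column: drop the trailing dimension from every ket."""
    return {ket.rsplit(": ", 1)[0]: coeff for ket, coeff in sdr.items()}
```

For example, random_column({'x1': 1, 'x2': 1, 'x3': 1}) might give {'x1: 7': 1, 'x2: 0': 1, 'x3: 3': 1}, and extract_category maps it back.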

Finally to some examples.
Let's input a sequence of a single integer. Let's try '2', and then '3':

sa: float-sequence |2>
e   1.0    |2> . |7> . |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
Pi  1.0    |2> . |6> . |5> . |3> . |5>
e   1.0    |2> . |8> . |1> . |8> . |2> . |8> . |4>
e   1.0    |2> . |8> . |4>
Pi  0.071  |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi  0.071  |3> . |5>
e   0.071  |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
Pi  0.071  |3> . |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi  0.071  |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
e   0.071  |1> . |8> . |2> . |8> . |4>
Pi  0.0    |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
e   0.0    |4>
|float-sequence>

sa: float-sequence |3>
Pi  1.0    |3> . |5>
Pi  1.0    |3> . |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi  0.071  |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi  0.071  |2> . |6> . |5> . |3> . |5>
e   0.071  |2> . |7> . |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
e   0.071  |2> . |8> . |4>
e   0.071  |4>
e   0.071  |2> . |8> . |1> . |8> . |2> . |8> . |4>
Pi  0.0    |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi  0.0    |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
e   0.0    |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
e   0.0    |1> . |8> . |2> . |8> . |4>
|float-sequence>

Here the first column is the name of the predicted sequence, the second is the similarity score, and the third is a walk of the matched sequence, and hence the sequence prediction given the input.
Let's try again, but this time with a non-integer:

sa: float-sequence |2.5>
Pi  0.619  |3> . |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi  0.619  |3> . |5>
e   0.619  |2> . |8> . |1> . |8> . |2> . |8> . |4>
Pi  0.619  |2> . |6> . |5> . |3> . |5>
e   0.619  |2> . |7> . |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
e   0.619  |2> . |8> . |4>
Pi  0.001  |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi  0.001  |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi  0.001  |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
e   0.001  |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
e   0.001  |4>
e   0.001  |1> . |8> . |2> . |8> . |4>
|float-sequence>

And we see '2.5' matches '3' and '2' with the same score, just as you would expect.
Now again, but with a longer input sequence:

sa: float-sequence |3.3 . 1 . 4.2>
Pi  0.664  |3> . |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi  0.078  |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi  0.0    |5> . |3> . |5>
|float-sequence>

Given the longer input sequence, we have now converged down to 3 matching sequences. The first one is a 66.4% match, since 3.3 approx-eq 3, 1 == 1
and 4.2 approx-eq 4. The second one a 7.8% match, since 3.3 approx-eq 4, 1 == 1, and 4.2 approx-eq 5. The third one is a 0% match, though it must
be slightly above 0 or else it wouldn't be displayed.

If we add one more digit we can filter down to a unique sequence, e.g.:

sa: float-sequence |3.3 . 1 . 4.2 . 1>
Pi  0.664  |3> . |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
|float-sequence>

Or:

sa: float-sequence |3.3 . 1 . 4.2 . 8.7>
Pi  0.078  |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
|float-sequence>

And finally, if the input sequence is too far from all the learnt sequences we get the empty, or don't know, ket |>.

sa: float-sequence |7 . 7 . 7>
|>

In HTM, this would probably be the point at which you raise an anomaly alert, i.e., no matching sequence.
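Glossing over the pattern/then plumbing, the matching itself can be sketched as: slide the input along every learnt sequence, score each alignment by combining per-element similarities, and report the remaining walk as the prediction. This is my reconstruction, with similarity combined by product and a small display threshold as assumptions; an empty result is the "don't know" / anomaly case.

```python
def match_sequences(input_seq, learnt, simm, threshold=1e-4):
    """learnt: dict of name -> list of elements. Returns a list of
    (name, score, predicted walk) tuples, best scores first."""
    results = []
    for name, seq in learnt.items():
        # Try aligning the input at every possible starting offset.
        for start in range(len(seq) - len(input_seq) + 1):
            score = 1.0
            for offset, element in enumerate(input_seq):
                score *= simm(element, seq[start + offset])
            if score > threshold:
                results.append((name, score, seq[start:]))
    return sorted(results, key=lambda r: -r[1])
```

With an exact-match simm and the Pi and e digit lists, a single-digit input reproduces the shape of the tables above, minus the noise tolerance that the Gaussian encoder provides.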

Now, another example, this time using words instead of floats, but with the same back-end code. Instead of the scalar encoder, we map words to random SDRs. See (2). So there should be no similarity between different words; it's black and white this time, unlike our float example.
We would need something like cortical.io if we wanted similarity between words.
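A sketch of this random word encoding: each word gets a fixed set of 10 random on-positions out of 65535, so two distinct words almost certainly share none. The function names are my own:

```python
import random

_word_sdrs = {}  # fixed assignment of word -> random sparse SDR

def encode_word(word, bits=10, dimensions=65535):
    """Assign each word a fixed random sparse SDR (a set of on-positions)."""
    if word not in _word_sdrs:
        _word_sdrs[word] = frozenset(random.sample(range(1, dimensions + 1), bits))
    return _word_sdrs[word]

def sdr_simm(a, b):
    """Overlap similarity: shared on-bits over bits per SDR."""
    return len(a & b) / max(len(a), len(b))
```

With 10 bits out of 65535, the chance that two words share even one position is roughly 10 * 10 / 65535, about 0.15%, which gives the black-and-white behaviour: the same word scores 1, different words score essentially 0.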

The two sentences we have learnt are (borrowed from this HTM school video https://www.youtube.com/watch?v=UBzemKcUoOk):
"boys eat many cakes" and "girls eat many pies".

And correspondingly we have, after loading the data into memory:

sa: float-sequence |boys>
boy sentence  1.0  |boys> . |eat> . |many> . |cakes>
|float-sequence>

sa: float-sequence |girls>
girl sentence  1.0  |girls> . |eat> . |many> . |pies>
|float-sequence>

And similarly we have:

sa: float-sequence |eat>
boy sentence   1.0  |eat> . |many> . |cakes>
girl sentence  1.0  |eat> . |many> . |pies>
|float-sequence>

sa: float-sequence |many>
boy sentence   1.0  |many> . |cakes>
girl sentence  1.0  |many> . |pies>
|float-sequence>

sa: float-sequence |eat . many>
boy sentence   1.0  |eat> . |many> . |cakes>
girl sentence  1.0  |eat> . |many> . |pies>
|float-sequence>

And that's it. Note that this is a toy system, so I don't think it's actually useful for anything, except perhaps as a demonstration of some ideas.
Thanks for your time.
Originally posted here.

(1): learning our two sequences:
---------------------------------------------------------
-- random encode the "end of sequence" marker:
full |range> => range(|1>,|2048>)
encode |end of sequence> => pick[10] full |range>

-- define our proof of concept scalar encoder:
encode |*> #=> rescale smooth[0.1]^10 |_self>

-- learn the scalar encodings for the digits:
encode |0> => encode |0>
encode |1> => encode |1>
encode |2> => encode |2>
encode |3> => encode |3>
encode |4> => encode |4>
encode |5> => encode |5>
encode |6> => encode |6>
encode |7> => encode |7>
encode |8> => encode |8>
encode |9> => encode |9>

-- learn the sequence of digits of Pi:
-- Pi
-- 3 1 4 1 5 9 2 6 5 3 5
-- name the sequence:
sequence-name |node: 1: *> => |Pi>

pattern |node: 1: 1> => random-column[10] encode |3>
then |node: 1: 1> => random-column[10] encode |1>

pattern |node: 1: 2> => then |node: 1: 1>
then |node: 1: 2> => random-column[10] encode |4>

pattern |node: 1: 3> => then |node: 1: 2>
then |node: 1: 3> => random-column[10] encode |1>

pattern |node: 1: 4> => then |node: 1: 3>
then |node: 1: 4> => random-column[10] encode |5>

pattern |node: 1: 5> => then |node: 1: 4>
then |node: 1: 5> => random-column[10] encode |9>

pattern |node: 1: 6> => then |node: 1: 5>
then |node: 1: 6> => random-column[10] encode |2>

pattern |node: 1: 7> => then |node: 1: 6>
then |node: 1: 7> => random-column[10] encode |6>

pattern |node: 1: 8> => then |node: 1: 7>
then |node: 1: 8> => random-column[10] encode |5>

pattern |node: 1: 9> => then |node: 1: 8>
then |node: 1: 9> => random-column[10] encode |3>

pattern |node: 1: 10> => then |node: 1: 9>
then |node: 1: 10> => random-column[10] encode |5>

pattern |node: 1: 11> => then |node: 1: 10>
then |node: 1: 11> => random-column[10] encode |end of sequence>


-- learn the sequence of digits of e:
-- e
-- 2 7 1 8 2 8 1 8 2 8 4
-- name the sequence:
sequence-name |node: 2: *> => |e>

pattern |node: 2: 1> => random-column[10] encode |2>
then |node: 2: 1> => random-column[10] encode |7>

pattern |node: 2: 2> => then |node: 2: 1>
then |node: 2: 2> => random-column[10] encode |1>

pattern |node: 2: 3> => then |node: 2: 2>
then |node: 2: 3> => random-column[10] encode |8>

pattern |node: 2: 4> => then |node: 2: 3>
then |node: 2: 4> => random-column[10] encode |2>

pattern |node: 2: 5> => then |node: 2: 4>
then |node: 2: 5> => random-column[10] encode |8>

pattern |node: 2: 6> => then |node: 2: 5>
then |node: 2: 6> => random-column[10] encode |1>

pattern |node: 2: 7> => then |node: 2: 6>
then |node: 2: 7> => random-column[10] encode |8>

pattern |node: 2: 8> => then |node: 2: 7>
then |node: 2: 8> => random-column[10] encode |2>

pattern |node: 2: 9> => then |node: 2: 8>
then |node: 2: 9> => random-column[10] encode |8>

pattern |node: 2: 10> => then |node: 2: 9>
then |node: 2: 10> => random-column[10] encode |4>

pattern |node: 2: 11> => then |node: 2: 10>
then |node: 2: 11> => random-column[10] encode |end of sequence>
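The pattern/then rules above are just a linked chain: each node stores an SDR to match against (pattern) and the SDR it predicts next (then), with node n's prediction doubling as node n+1's pattern. In Python terms, roughly (my reconstruction, not the actual implementation; encode and random_column are passed in):

```python
def learn_sequence(elements, encode, random_column, end_marker="end of sequence"):
    """Return a list of {'pattern': ..., 'then': ...} nodes chaining elements,
    terminated by the end-of-sequence marker."""
    nodes = []
    pattern = random_column(encode(elements[0]))
    for nxt in list(elements[1:]) + [end_marker]:
        then = random_column(encode(nxt))
        nodes.append({"pattern": pattern, "then": then})
        pattern = then  # node n's prediction becomes node n+1's pattern
    return nodes
```

Each node's then being the next node's pattern mirrors the `pattern |node: 1: n+1> => then |node: 1: n>` rules exactly.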



(2): learning our two sentences:
---------------------------------------------------------
-- random encode the "end of sequence" marker:
full |range> => range(|1>,|65535>)
encode |end of sequence> => pick[10] full |range>

-- learn the random encodings for the words:
encode |boys> => pick[10] full |range>
encode |eat> => pick[10] full |range>
encode |many> => pick[10] full |range>
encode |cakes> => pick[10] full |range>
encode |girls> => pick[10] full |range>
encode |pies> => pick[10] full |range>


-- learn "boys eat many cakes":
-- boy sentence
-- name the sequence:
sequence-name |node: 1: *> => |boy sentence>

pattern |node: 1: 1> => random-column[10] encode |boys>
then |node: 1: 1> => random-column[10] encode |eat>

pattern |node: 1: 2> => then |node: 1: 1>
then |node: 1: 2> => random-column[10] encode |many>

pattern |node: 1: 3> => then |node: 1: 2>
then |node: 1: 3> => random-column[10] encode |cakes>

pattern |node: 1: 4> => then |node: 1: 3>
then |node: 1: 4> => random-column[10] encode |end of sequence>


-- learn "girls eat many pies":
-- girl sentence
-- name the sequence:
sequence-name |node: 2: *> => |girl sentence>

pattern |node: 2: 1> => random-column[10] encode |girls>
then |node: 2: 1> => random-column[10] encode |eat>

pattern |node: 2: 2> => then |node: 2: 1>
then |node: 2: 2> => random-column[10] encode |many>

pattern |node: 2: 3> => then |node: 2: 2>
then |node: 2: 3> => random-column[10] encode |pies>

pattern |node: 2: 4> => then |node: 2: 3>
then |node: 2: 4> => random-column[10] encode |end of sequence>


Created by United_Fools 4 months, 1 week ago

did some deep net invent this human unreadable jumbo gumbo above?




Created by kr5dit 4 months, 1 week ago

It's invented its own language! Someone put mumble on a fixed supervised model instead! It's the only way to be sure!

